(57) Abstract: The invention provides a molecular taxonomy of lung carcinoma, the leading cause of cancer death in the United
States and worldwide. Oligonucleotide microarrays were used to analyze mRNA expression levels corresponding to 12,600
transcript sequences in 186 lung tumor samples, including 139 adenocarcinomas resected from the lung. Hierarchical and probabilistic
clustering of expression data defined distinct subclasses of lung adenocarcinoma. Among these were tumors with high relative
expression of neuroendocrine genes and of type II pneumocyte genes, respectively. Retrospective analysis revealed a less favorable
outcome for the adenocarcinomas with neuroendocrine gene expression. The diagnostic potential of expression profiling is
emphasized by its ability to discriminate primary lung adenocarcinomas from metastases of extrapulmonary origin. These results suggest
that integration of expression profile data with clinical parameters could aid in diagnosis of lung cancer patients.

CLASSIFICATION OF LUNG CARCINOMAS
USING GENE EXPRESSION ANALYSIS 

RELATED APPLICATIONS 
[0001] This application claims priority to, and the benefit of, Provisional Patent Application 
USSN 60/325,962 filed on September 28, 2001, the entire disclosure of which is incorporated
by reference herein. 

GOVERNMENT SUPPORT 
[0002] The invention was supported, in whole or in part, by grant U01 CA84995 from the
National Cancer Institute. The Government has certain rights in the invention.

FIELD OF THE INVENTION 

[0003] In general, the invention relates to a gene expression based classification of lung 
cancer and a sub-classification of lung adenocarcinoma. This classification serves as a step 
towards a new molecular taxonomy of lung tumors and demonstrates the power of gene 
expression profiling in lung cancer diagnosis. 

BACKGROUND 

[0004] Carcinoma of the lung claims more than 150,000 lives every year in the United States,
thus exceeding the combined mortality from breast, prostate and colorectal cancers. Current
lung cancer classification is based on clinicopathological features. Lung carcinomas are
usually classified as small cell lung carcinomas (SCLC) or non-small cell lung carcinomas
(NSCLC). Neuroendocrine features, defined by microscopic morphology and
immunohistochemistry, are hallmarks of the high-grade SCLC and large cell neuroendocrine tumors
and of intermediate/low-grade carcinoid tumors. NSCLC is histopathologically and clinically
distinct from SCLC, and is further subcategorized as adenocarcinomas, squamous cell
carcinomas, and large cell carcinomas, of which adenocarcinomas are the most common.
[0005] The histopathological sub-classification of lung adenocarcinoma is challenging. In 
one study, independent lung pathologists agreed on lung adenocarcinoma sub-classification 
in only 41% of cases. However, a favorable prognosis for bronchioloalveolar carcinoma
(BAC), a histological sub-class of lung adenocarcinoma, argues for refining such distinctions.
In addition, metastases of non-lung origin can be difficult to distinguish from lung
adenocarcinomas. 
[0006] Therefore, there is a need in the art for methods and compositions that are useful to
distinguish cancer of lung origin from metastases of non-lung origin, and to distinguish
different types of lung cancer.

SUMMARY 

[0007] The development of microarray methods for large-scale analysis of gene expression
makes it possible to search systematically for molecular markers of cancer classification and
outcome prediction in a variety of tumor types. Currently, the only effective prognostic
indicator for NSCLC in clinical use is surgical-pathological staging. However, according to
the invention, the simultaneous analysis of a large number of independent clinical markers
offers a powerful adjunct approach in surgical-pathological staging. 

[0008] According to the invention, a comprehensive gene expression analysis of human lung 
tumors identified distinct lung adenocarcinoma sub-classes that were reproducibly generated 
across different cluster methods. Notably, the C2 adenocarcinoma subclass, defined by 
neuroendocrine gene expression, is associated with a less favorable outcome, while the C4 
group appears to be associated with a more favorable outcome. 

[0009] Hierarchical clustering methods offer a powerful approach for class discovery, but are
less useful for determining confidence for the classes discovered. In one aspect of the
invention, a bootstrap probabilistic clustering is combined with the hierarchical method to
measure the strength of sample-sample association, thereby defining cluster membership with
greater confidence.

[0010] Although adenocarcinomas with neuroendocrine features have been reported, unique 
markers that precisely define such tumors have not been described. In another aspect of the
invention, putative neuroendocrine markers, for example, kallikrein 11, that discriminate the
C2 tumors from all other lung tumors, are identified. In one embodiment, this marker, which
is related to the vasodepressor renal kallikrein, is of clinical interest given the observation of
orthostatic hypotension in some lung cancer patients. 

[0011] In a further aspect of the invention, putative metastases of extra-pulmonary origin
with non-lung expression signatures were discovered among presumed lung 
adenocarcinomas. According to the invention, gene expression analysis can serve as a 
diagnostic tool to confirm and identify metastases to the lung. 

[0012] In one embodiment, the invention provides lung specific marker arrays. In another
embodiment, the invention provides lung specific marker information in computer-accessible
form. In other embodiments, methods and compositions of the invention are useful for drug
selection, drug evaluation, patient prognosis, and patient monitoring. 
[0013] Diagnostic methods and arrays of the invention can include all of the markers that are
characteristic of one or more classes or subclasses of cancer described herein. Alternatively,
single markers can be used. Preferably 1 to 20, 1 to 10, or about 5 genetic markers are used
in an assay or on an array to diagnose or detect a specific type of cancer. A single assay may
be used to diagnose or detect one or more classes or subclasses of cancer disclosed herein. A 
useful assay includes one or more markers of one or more classes or subclasses of cancer. 
Preferred markers for different classes and subclasses of cancer are shown in Tables 1-9. 
[0014] Drug screening methods of the invention involve assaying candidate compounds or
drugs for their effect on one or more markers of one or more different classes or subclasses
of cancer described herein. Preferably 1 to 20, 1 to 10, or about 5 genetic markers are used in
a screening assay to identify a drug that is effective to reduce the expression level of at least
one of the markers. Preferred markers for different classes and subclasses of cancer are
shown in Tables 1-9. Preferred drug candidates reduce the expression of markers associated
with all classes of cancer. However, drug candidates that reduce the expression of markers
associated with one or a subset of classes of cancer are also useful. Drug candidates
identified in these assays are preferably subject to clinical testing to evaluate their
effectiveness against different types of cancer, including different classes and subclasses of 
lung cancer. 

[0015] According to the invention, markers shown to be overexpressed in different types of
cancer (including different classes or subclasses of lung cancer) can be used as targets for
drug development. Useful drugs include antisense nucleic acids that decrease the expression
of one or more markers described herein. Useful drugs also include antibodies or other
compounds that interfere with the gene product of one or more markers of the invention. For
example, a protease inhibitor that inhibits the activity of kallikrein 11 may be therapeutically
useful.

DESCRIPTION OF THE DRAWINGS 
[0016] Figure 1. Survival analysis of neuroendocrine C2 adenocarcinomas is shown. 
Kaplan-Meier curves for C2 versus all other adenocarcinomas. A. All patients. C2 (n = 9)
and non-C2 (n = 117). B. Patients with stage I tumors only. C2 (n = 4) and non-C2 (n = 72).
[0017] Figure 2. A computer system is shown. The Memory can be a RAM, ROM,
CDROM, Tape, Disk, or other form of memory. The Removable data medium can be a 
magnetic disk, a CDROM, a tape, an optical disk, or other form of removable data medium. 
[0018] Figure 3. A box plot of median array intensity across IVT batches is shown and 
examples of uncorrected and corrected non-linear responses on same specimens following 
linear and non-linear scaling methods are also shown. 

[0019] Figure 4. Non-linear responses in reference RNA samples are shown following linear
scaling (a, c and e) that is corrected after rank invariant scaling (b, d and f).
[0020] Figure 5. Pairwise agreement (R.sq values) of 12,600 rank invariant scaled expression
values of genes are shown between replicate arrays.

[0021] Figure 6. Clusters selected by AutoClass over several runs of the algorithm are 
shown. The left panel plots the distribution over 200 runs of the algorithm on the original
data set (experiment 1), and on the bootstrapped data sets (experiment 2), both defined over
675 genes. The right panel plots the corresponding distributions with respect to the data sets 
defined over 1514 genes. 

DETAILED DESCRIPTION OF THE INVENTION 
[0022] The invention provides methods and compositions for classifying lung carcinomas 
based on gene expression information. In general, the invention relates to the analysis of 
gene expression information in normal and cancerous lung tissue and the identification of 
types or classes of lung cancer based on different patterns of gene expression in different lung
carcinomas. In addition, the invention provides specific markers of the different types and
classes of lung cancer. According to the invention, markers are useful to classify and
evaluate new lung cancers, to provide a prognosis for a lung cancer patient, to identify drugs,
and to monitor the progression of a lung cancer in a patient. 

[0023] According to the invention, gene expression can be assayed by analyzing and/or 
quantifying the nucleic acid (including mRNA, rRNA, tRNA and other RNA products of 
gene transcription) or protein (including short peptide and other protein translation products) 
products of gene expression. Methods for measuring gene expression are known in the art, 
and examples are discussed herein. However, one of ordinary skill in the art will understand 
that methods of the invention relate to all assays of gene expression in normal or diseased 
lung samples. 

[0024] In one embodiment, a gene expression analysis of 186 human carcinomas from the
lung provides evidence for biologically distinct sub-classes of lung adenocarcinoma.
[0025] More fundamental knowledge of the molecular basis and classification of lung
carcinomas is useful in the prediction of patient outcome, the informed selection of currently 
available therapies, and the identification of novel molecular targets for chemotherapy. The 
recent development of targeted therapy against the Abl tyrosine kinase for chronic myeloid 
leukemia illustrates the power of such biological knowledge. 

Molecular Classification of Diverse Lung Tumors. 
[0026] The present invention provides methods for classifying diverse lung tumors based on 
gene expression profiles. In preferred embodiments, lung tumors are classified based on the 
expression of a set of marker genes characteristic of a type of lung cancer. In a more
preferred embodiment, classification is based on the expression of between 1 and 50,
preferably between 1 and 20, more preferably between 1 and 10, and more preferably
between 5 and 10 marker genes, the expression of which is strongly correlated with a type of
lung cancer. 

[0027] First, hierarchical clustering (Eisen, M. B., Spellman, P. T., Brown, P. O. &
Botstein, D. (1998) Proc Natl Acad Sci USA 95, 14863-8) was applied to classify all 203
samples using the 3,312 most variably expressed transcripts. The resulting clusters
recapitulated the distinctions between established histologic classes of lung tumors
(pulmonary carcinoid tumors, SCLC, squamous cell lung carcinomas, and
adenocarcinomas), thus validating the experimental and analytic approach of the invention.
Two-dimensional hierarchical clustering of 203 lung tumors and normal lung samples was
performed with 3,312 transcript sequences. The expression index for each transcript was
normalized. Adenocarcinomas resected from the lung and a subset of adenocarcinomas
suspected as colon metastases were analyzed. 
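
The sample-clustering step described above can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering with a distance of one minus the Pearson correlation; the specification does not state the linkage or distance settings actually used, so these are illustrative choices. The toy expression values, the sample names, and the function names pearson and average_linkage are hypothetical. Only samples are clustered here, whereas the analysis above is two-dimensional (samples and transcripts).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    profiles: dict mapping sample name -> list of expression values.
    Distance between samples is 1 - Pearson correlation (an assumption);
    distance between clusters is the mean of all pairwise sample distances.
    """
    clusters = [[name] for name in profiles]
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def cluster_distance(c1, c2):
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Hypothetical toy data: two samples per group with similar profiles.
toy = {
    "normal_1": [9.0, 8.5, 1.0, 1.2],
    "normal_2": [8.7, 9.1, 0.9, 1.5],
    "tumor_1": [1.1, 0.8, 7.9, 8.4],
    "tumor_2": [0.9, 1.3, 8.2, 8.0],
}
groups = average_linkage(toy, n_clusters=2)
```

With these toy profiles the two normal samples and the two tumor samples end up in separate groups. A real analysis would retain the full merge tree (dendrogram) rather than stopping at a fixed number of clusters.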

[0028] Normal lung samples form a distinct group, but are most similar to the
adenocarcinomas. Marker genes that characterize normal lung samples include TGF-β
receptor type II, tetranectin and ficolin 3. A cluster of genes with high relative expression in
normal lung includes: TGF-β receptor II; epithelial membrane prot. 2; PECAM-1 (CD31
antigen); PECAM-1 (CD31 antigen); cadherin 5, type 2, VE-cadherin; AF070648; four and a
half LIM domains 1; microfibrillar-associated prot. 4; amine oxidase, copper containing 3; A
kinase anchor prot. 2; ficolin 3; receptor activity modifying prot. 2; tetranectin; adv.
glycosylation end prod.-sp. receptor; TEK tyrosine kinase, endothelial; and slit homolog 2.
Elevated TGF-β receptor type II levels have been previously reported for normal bronchial
and alveolar epithelium compared to lung carcinomas.
[0029] SCLC and carcinoid tumors both show high-level expression of neuroendocrine genes
including insulinoma-associated gene 1 (Ball, D. W., Azzoli, C. G., Baylin, S. B., Chi, D.,
Dou, S., Donis-Keller, H., Cumaraswamy, A., Borges, M. & Nelkin, B. D. (1993) Proc Natl
Acad Sci USA 90, 5648-52, Lan, M. S., Russell, E. K., Lu, J., Johnson, B. E. & Notkins,
A. L. (1993) Cancer Res 53, 4169-71), achaete scute homolog 1 (Ball, D. W., Azzoli, C.
G., Baylin, S. B., Chi, D., Dou, S., Donis-Keller, H., Cumaraswamy, A., Borges, M. &
Nelkin, B. D. (1993) Proc Natl Acad Sci USA 90, 5648-52, Lan, M. S., Russell, E. K., Lu,
J., Johnson, B. E. & Notkins, A. L. (1993) Cancer Res 53, 4169-71), gastrin-releasing
peptide and chromogranin A. Several previously undescribed markers for SCLC such as
thymosin-β and the cell cycle inhibitor p18INK4C were also observed. A cluster of genes with
high relative expression in neuroendocrine tumors (small cell lung cancer and pulmonary
carcinoids) includes: tubulin, β polypeptide; insulinoma-associated 1; extra spindle poles,
yeast homolog; core-binding factor, (runt), α subunit 2; guanine nucleotide binding prot. 4;
achaete-scute homolog-like 1; achaete-scute homolog-like 1; CDKN2C (p18); forkhead box
G1B; thymosin β, neuroblastoma; ISL1 transcription factor; distal-less homeobox 6;
transcription factor 12 (HTF4); PC4 and SFRS1 interacting prot. 2. In one embodiment of
the invention, only a few markers are shared between SCLC and carcinoids, while a distinct
group of genes defines carcinoid tumors. Two-dimensional hierarchical clustering of 203
lung tumor and normal samples (data set A) was performed with 3,312 genes as described
herein. Different clusters of genes with high relative expressions were observed for normal
lung; lung carcinoid; small cell lung carcinoma; squamous cell lung carcinoma; and colon
metastasis. Clusters C1, C2, C3 and C4 were defined by clustering of data set B. This
suggests that carcinoids are highly divergent from malignant lung tumors.
[0030] Squamous cell lung carcinomas, for which diagnostic criteria include evidence of
squamous differentiation such as keratin formation, form a discrete cluster with high-level
expression of transcripts for multiple keratin types and the keratinocyte-specific protein
stratifin. A cluster of genes with high relative expression in squamous cell lung carcinomas
with keratin markers includes: glypican 1; collagen, type VII, α 1; desmoglein 3; W27953;
keratin 17; keratin 5; tumor prot. 63; keratin 6; ataxia-telangiectasia group D-assoc. prot.;
serine proteinase inhibitor, clade B (5); bullous pemphigoid antigen 1; KIAA0699;
CaN19/M87068; S100 calcium-binding prot. A2; and galectin 7. The squamous tumors also
show over-expression of p63, a p53-related gene essential for the formation of squamous
epithelia. Several adenocarcinomas that express high levels of squamous-associated genes
also display histological evidence of squamous features.

[0031] Finally, expression of proliferative markers, such as PCNA, thymidylate synthase,
MCM2 and MCM6, is highest in SCLC, which is known to be the most rapidly dividing lung
tumor. A cluster of genes with high relative expression associated with proliferation includes:
MCM2; MCM6; Rad2; flap structure-specific endonuclease 1; PCNA; thymidylate
synthetase; DEK oncogene; H2A histone family, member Z; high-mobility group prot. 2;
and ZW10 interactor. However, unlike the other major lung tumor classes shown above, lung
adenocarcinomas were not defined by a unique set of marker genes.

Class Discovery among Lung Adenocarcinomas. 

[0032] Strong signatures in other lung tumors may obscure the successful subclassification of
lung adenocarcinoma in the above analysis. Therefore, a hierarchical clustering was used to
sub-classify a data set restricted to adenocarcinomas. Classifications derived by hierarchical
clustering and probabilistic clustering algorithms were compared. A two-dimensional
colored matrix was generated as a visual representation of a corresponding numerical matrix
whose entries record a normalized measure of association strength between samples. Strong
association approaches a value of 1 and poor association is close to 0. Associations were
obtained for colon metastasis; normal lung; C1 through C4 (adenocarcinoma clusters);
additional groups with weaker association were also observed (groups I, II, and III). Genes
expressed at high levels in specific subsets of adenocarcinomas can be clustered as a function
of histologic differentiation within lung adenoma sub-classes. To avoid spurious variations
contributing to the clustering process, 675 transcript sequences were selected with expression
levels that were most highly reproducible in duplicate adenocarcinoma samples, yet whose
expression varied widely across the chosen sample set (Dataset B), as discussed in the
Examples. Normal lung specimens were included in this dataset, as normal epithelium is a
component of the grossly dissected adenocarcinoma samples.

[0033] To reduce potential classification-bias due to choice of clustering method, and to 
clarify adenocarcinoma sub-class boundaries, a model-based probabilistic clustering method 
(Kang, Y., Prentice, M. A., Mariano, J. M., Davarya, S., Linnoila, R. I., Moody, T. W.,
Wakefield, L. M. & Jakowlew, S. B. (2000) Exp Lung Res 26, 685-707) was also used. To
assess the overall strength of each pair-wise association, the frequency with which two
samples appeared together in a cluster was measured over 200 clustering iterations on
bootstrap data sets. A stable cluster was defined as a set of at least 10 samples with a high
degree of association (a threshold of 0.45 was used, corresponding to shared cluster
membership in at least 45% of the bootstrap datasets in which both samples were included).
According to this definition, several clusters suggested by the hierarchical tree are stable.
These associations can be shown as a color matrix overlaid on a tree structure obtained from
hierarchical clustering. The blocks of associated samples show that both clustering methods
recognized subclasses corresponding to normal lung and putative colon metastases (CM).
Four subclasses of primary lung adenocarcinoma (C1 to C4) were also observed by both
probabilistic and hierarchical clustering. Several smaller and/or less robust groups were also
observed (Groups I, II, and III).
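
The pair-wise association measure described above can be sketched as follows. This is a minimal illustration, assuming that each bootstrap clustering run is available as a mapping from the samples drawn in that run to a cluster label; the function names association_strength and associated_pairs, the sample names, and the four toy runs are hypothetical. The further requirement that a stable cluster contain at least 10 samples is not implemented here.

```python
from collections import defaultdict
from itertools import combinations


def association_strength(runs):
    """Fraction of runs in which each sample pair shares a cluster.

    runs: list of dicts, one per bootstrap clustering run, mapping each
    sample included in that run to its cluster label. A pair is scored only
    over the runs in which both samples were included.
    """
    together = defaultdict(int)
    included = defaultdict(int)
    for labels in runs:
        for a, b in combinations(sorted(labels), 2):
            included[(a, b)] += 1
            if labels[a] == labels[b]:
                together[(a, b)] += 1
    return {pair: together[pair] / n for pair, n in included.items()}


def associated_pairs(strength, threshold=0.45):
    """Pairs whose association meets the 0.45 threshold quoted in the text."""
    return sorted(pair for pair, s in strength.items() if s >= threshold)


# Hypothetical labels from four bootstrap runs (the text describes 200).
runs = [
    {"AD1": 0, "AD2": 0, "AD3": 1},
    {"AD1": 1, "AD2": 1, "AD3": 0},
    {"AD1": 0, "AD3": 0},  # AD2 was not drawn in this resample
    {"AD1": 2, "AD2": 2, "AD3": 2},
]
strength = association_strength(runs)
strong = associated_pairs(strength)
```

Cluster labels are compared only within a run, so the arbitrary numbering of clusters from one run to the next does not affect the result.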

[0034] Probabilistic clustering also revealed correlations between samples that do not directly 
cluster together. For example, although cluster C4 falls in the right branch of the hierarchical 
dendrogram with normal lung, it shows significant association with some subclasses in the 
left dendrogram (groups I and III and cluster C3) but not with other subclasses (clusters CM,
C1, and C2).

[0035] Clusters C2, C3, and C4 were also seen as coherent adenocarcinoma groups within 
the hierarchical clustering of the larger set of lung tumors using the 3,312 transcript sequence 
set (Dataset A). The reproducible generation of these adenocarcinoma subclasses, across 
both clustering methods and both gene sets analyzed, supports the validity of the 
adenocarcinoma clusters and their boundaries. 

[0036] In order to identify genes that best defined the proposed clusters, a supervised 
approach was used to extract marker genes from the entire set of 12,600 transcript sequences.
For each cluster, selected genes were the most preferentially expressed in the cluster relative
to all other samples, using the signal-to-noise metric described previously (Golub, T. R.,
Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M.
L., Downing, J. R., Caligiuri, M. A., et al. (1999) Science 286, 531-7). The genes whose
expression correlated best with each class are useful as markers for class prediction of
unknown lung cancer samples.
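
A minimal sketch of marker ranking by a signal-to-noise score follows. It assumes the commonly cited form of the metric from Golub et al., the difference of the class means divided by the sum of the class standard deviations; the use of the sample standard deviation, the function names signal_to_noise and rank_markers, and the gene and sample names are assumptions made for illustration.

```python
from statistics import mean, stdev


def signal_to_noise(in_class, out_class):
    """(mean_in - mean_out) / (sd_in + sd_out) for one gene.

    Uses the sample standard deviation, which is an assumption here.
    """
    return (mean(in_class) - mean(out_class)) / (stdev(in_class) + stdev(out_class))


def rank_markers(expression, in_cluster, top_n):
    """Return the top_n genes most preferentially expressed in the cluster.

    expression: dict gene -> dict sample -> expression value.
    in_cluster: set of sample names belonging to the cluster of interest.
    """
    scores = {}
    for gene, by_sample in expression.items():
        inside = [v for s, v in by_sample.items() if s in in_cluster]
        outside = [v for s, v in by_sample.items() if s not in in_cluster]
        scores[gene] = signal_to_noise(inside, outside)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical values: gene_a is high in the cluster, gene_b is not.
expression = {
    "gene_a": {"s1": 10.0, "s2": 12.0, "s3": 2.0, "s4": 4.0},
    "gene_b": {"s1": 5.0, "s2": 7.0, "s3": 5.0, "s4": 7.0},
}
top = rank_markers(expression, in_cluster={"s1", "s2"}, top_n=1)
```

Each class needs at least two samples for the standard deviation to be defined, and a gene that is constant in both classes would give a zero denominator, so a real implementation would need to guard against both cases.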

Identification of Adenocarcinomas Metastatic to the Lung. 

[0037] The present invention provides methods for identifying metastatic tumors of non-lung 
origin. A key issue in lung tumor diagnosis is the discrimination of a primary lung 
adenocarcinoma from a distant metastasis to the lung. One distinct hierarchical cluster of 12
samples was identified that most likely represent metastatic adenocarcinomas from the colon.
These tumors express high levels of galectin-4, CEACAM1 and liver-intestinal cadherin 17, as
well as c-myc, which is commonly overexpressed in colon carcinoma. Genes expressed at
high levels in colon metastases include: c-myc; ETS-2; expressed in thyroid; cadherin 17,
(liver-intestine); galectin-4; transmem. 4 superfam. mem. 3; integrin, α 6; trypsin 4, brain;
diacylglycerol O-acyltransferase; E74-like factor 3; claudin 4; claudin 3; KIAA0792 gene
product; CEACAM-1; and immediate early response 3. Of the 10 samples in this group for
which clinical history and/or histopathologic information was available, only 7 samples had
been previously diagnosed as metastases of colonic origin. Other adenocarcinomas that
showed non-lung signatures included AD163, which expressed several breast-associated
markers including estrogen receptor and mammaglobin, and was associated with a clinical
history and histopathology consistent with breast metastasis. Also, AD368, which was not
identified as a metastasis, expressed high levels of albumin, transferrin, and other markers
associated with the liver. Thus, clustering identified suspected metastases of
extra-pulmonary origin, including some that were previously undetected. Accordingly, methods of
the invention can play a pivotal role for gene expression analysis in lung tumor diagnosis.

Molecular Signature of Lung Adenocarcinoma Sub-Classes. 

[0038] The present invention also provides methods for identifying subclasses of lung 
adenocarcinoma. Hierarchical and probabilistic clustering defined four distinct sub-classes of 
primary lung adenocarcinomas. Tumors in the C1 cluster express high levels of genes
associated with cell division and proliferation (ubiquitin carrier prot.; Cks-Hs2; high-mobility
group prot. 2; flap structure-specific endonuclease 1; MCM6; thymidine kinase 1; PCNA; 
and W27939), some of which are also expressed in the squamous cell lung carcinoma and 
SCLC samples in Dataset A. Relatively high-level expression of proliferation-associated 
genes was also seen in cluster C2. 

[0039] Several neuroendocrine markers, such as dopa decarboxylase and achaete-scute
homolog 1, define cluster C2 (kallikrein 11; dopa decarboxylase; achaete-scute homolog-1;
achaete-scute homolog-1; calcitonin-related polypeptide α; proprotein convertase subtilisin;
and carboxypeptidase E) and some of these are also expressed in SCLC and pulmonary
carcinoids. However, the serine protease, kallikrein 11, is uniquely expressed in the
neuroendocrine C2 adenocarcinomas, and not in other neuroendocrine lung tumors.
[0040] C3 tumors are defined by high-level expression of two sets of genes. Expression of 
one gene cluster (ATPase, Na+/K+ transporting; mesothelin; S100 calcium-binding prot. P;
solute carrier family 16; KIAA0828; phospholipase A2, group X; progastricsin (pepsinogen
C); cytokine receptor-like factor 1; dual specificity phosphatase 4; ornithine decarboxylase 1;
ornithine decarboxylase 1; TS deleted in oral cancer-related 1; ribosomal S6; sodium channel,
nonvoltage-gated 1 α; DKFZP564O0823; glutathione S-transferase pi; glutathione
S-transferase pi; and hepsin), including ornithine decarboxylase 1 and glutathione S-transferase
pi, is shared with the neuroendocrine C2 cluster. Expression of the second set of genes is
shared with cluster C4 and with normal lung. Genes expressed at high levels in C4, C3 and
normal lung include: surfactant, pulmonary-assoc. prot. B; N-acylsphingosine
amidohydrolase; cytochrome b-5; cytochrome b-5; deleted in liver cancer 1; Ca+ channel,
voltage-dependent; surfactant, pulmonary-assoc. prot. C; surfactant, pulmonary-assoc. prot.
D; AL049963; ATP-binding cassette (ABC1); KIAA0018 gene product; cathepsin H;
selenium binding protein 1; KIAA0758; leukotriene A4 hydrolase; AF035315; leukocyte
protease inhibitor; and BENE. Highest expression of type II alveolar pneumocyte markers,
such as thyroid transcription factor 1, and surfactant protein B, C and D genes, was seen in
cluster C4, followed by normal lung and C3 cluster. Other markers that defined cluster C4
included cytochrome b5, cathepsin H, and epithelial mucin 1.

Relation between Gene Expression Tumor Classes, Histological Analysis and Smoking 
History. 

[0041] Cluster C1 primarily contains poorly differentiated tumors, while C3 and C4 contain
predominantly well-differentiated tumors. Adenocarcinomas of cluster C2 fell in between.
Ten of the 14 C4 tumors had been identified as BACs by at least one out of three pathologists
who examined the tumors; in contrast, 15 of the remaining 113 adenocarcinomas were
similarly described as BACs. The presence of type II pneumocyte markers and the high
fraction of putative BACs suggest that cluster C4 is likely to be a gene expression counterpart
to BAC. All of the C4 tumors in this study were surgical-pathological stage I tumors. 
[0042] Although microscopic analysis indicated that samples varied in homogeneity, 
contamination of normal lung cells does not seem to have overwhelmed the expression 
signatures. The degree to which tumors clustered with normal samples did not reflect the
percentage of tumor cells in a sample in most cases. Class C4 is most similar to normal lung
in both hierarchical and probabilistic clustering, yet these tumors all revealed at least an
estimated 50% tumor nuclei and in most samples over 80%. In contrast, classes C2 and CM
contain tumors with as few as 30% estimated tumor nuclei but are sharply distinguishable
from the normal lung. Note that only adenocarcinoma specimen AD363, with an estimated
30% tumor content in the adjacent section, clustered with normal lung. 
[0043] Two adenocarcinoma sub-classes were associated with lower tobacco smoking
histories. The presumed metastases of colon origin (CM) and C4 adenocarcinomas with type
II pneumocyte gene expression have median smoking histories of 2.5 and 23 pack-years,
respectively. The entire data set had a median smoking history of 40 pack-years. 

Correlation of Patient Outcome with Putative Adenocarcinoma Classes. 
[0044] The present invention also provides methods for predicting patient outcome based on
the analysis of lung marker gene expression. Lung cancer patient outcome was correlated
with the sub-classes of lung adenocarcinomas defined herein. The neuroendocrine C2
adenocarcinomas were associated with a less favorable survival outcome than all other
adenocarcinomas (Fig. 1A, 1B). The median survival for C2 tumors was 21 months
compared to 40.5 months for all non-C2 tumors (P = 0.00476). When only stage I tumors are
considered, the median survival for patients with C2 tumors was 20 months compared to 47.8
months for patients with non-C2 tumors; as the numbers are smaller, the P-value for this
comparison is 0.0753. In contrast, C4 adenocarcinomas with type II pneumocyte gene
expression (n = 14) were associated with a more favorable survival outcome than non-C4
tumors. The median survival for patients with C4 tumors was 49.7 months while the median 
survival for patients with non-C4 tumors was 33.2 months (P = 0.049; note that the non-C2 
and non-C4 groups are different because of the exclusion of each group separately in the 
comparison). For patients with stage I tumors, the median survival in the C4 group was 49.7 
months and 43.5 months in the non-C4 group (P = 0.191). There was no detectable 
difference in prognosis between the primary lung adenocarcinomas and the metastases to the 
lung of colonic origin. 
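
For illustration, median survival can be read off a Kaplan-Meier estimate of the kind plotted in Figure 1. The following is a minimal sketch of the standard product-limit estimator; the follow-up times and event indicators are hypothetical and are not taken from the study, and the function names kaplan_meier and median_survival are introduced here for illustration. The statistical test used to obtain the P-values quoted above is not specified in this section and is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) survival estimate.

    times: follow-up time for each patient (for example, in months).
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up data, not taken from the study.
times = [5, 8, 12, 20, 21, 30]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the curve steps down only at the observed death times.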

Arrays of gene expression detection agents. 

[0045] The present invention also provides arrays of gene expression detection agents. 
Preferred gene expression detection agents hybridize specifically to marker genes disclosed 
herein. Such agents may be RNA, DNA, or PNA molecules. Preferred agents are 
oligonucleotides. Alternative agents bind specifically to the protein expression products of
the marker genes disclosed herein. Preferred agents include antibodies and aptamers.
[0046] Agents, such as oligonucleotides, are preferably attached to a solid support in the
form of an array. Oligonucleotide arrays in the form of gene chips and useful hybridization
assays are known in the art and disclosed for example in U.S. Patent Nos. 5,631,734; 
5,874,219; 5,861,242; 5,858,659; 5,856,174; 5,843,655; 5,837,832; 5,834,758; 5,770,722; 
5,770,456; 5,733,729; 5,556,752; 6,045,996; and 6,261,776. In a preferred embodiment, an
array includes oligonucleotides for measuring the expression level of markers for a specific
type or class of lung cancer. In a more preferred embodiment, an array of the invention
includes a plurality of oligonucleotides that are specific for markers for several types or
classes of lung cancer or adenocarcinoma. 

Information about marker genes and marker gene expression levels. 
[0047] The present invention further provides databases of marker genes and information
about the marker genes, including the expression levels that are characteristic of different 
lung cancer types or lung adenocarcinoma subclasses. According to the invention, marker 
gene information is preferably stored in a memory in a computer system (Fig. 2). 
Alternatively, the information is stored in a removable data medium such as a magnetic disk, 
a CDROM, a tape, or an optical disk. In a further embodiment, the input/output of the 
computer system can be attached to a network and the information about the marker genes 
can be transmitted across the network. 

[0048] Preferred information includes the identity of a predetermined number of marker 
genes the expression of which correlates with a particular type of lung cancer or a particular 
subclass of adenocarcinoma. In addition, threshold expression levels of one or more marker 
genes may be stored in a memory or on a removable data medium. According to the 
invention, a threshold expression level is a level of expression of the marker gene that is 
indicative of the presence of a particular type or class of lung cancer. 
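
By way of illustration only, the comparison of measured expression levels against stored threshold expression levels might be sketched as follows. The marker names and threshold values are hypothetical placeholders rather than values disclosed herein, and the rule that a level at or above the threshold counts as indicative is an assumption made for this sketch.

```python
# Hypothetical marker names and threshold values, for illustration only.
THRESHOLDS = {
    "MARKER_A": 500.0,
    "MARKER_B": 1200.0,
}


def markers_above_threshold(sample, thresholds):
    """Return the sorted names of markers whose measured level meets
    or exceeds the stored threshold expression level.

    Markers missing from the sample are treated as not expressed (0.0).
    """
    return sorted(
        name
        for name, cutoff in thresholds.items()
        if sample.get(name, 0.0) >= cutoff
    )


if __name__ == "__main__":
    measured = {"MARKER_A": 730.0, "MARKER_B": 310.0}
    print(markers_above_threshold(measured, THRESHOLDS))
```

In such a sketch the threshold table plays the role of the information held in memory or on a removable data medium, and the measured values come from an assay of the tissue sample.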
[0049] In a highly preferred embodiment, a computer system or removable data medium 
includes the identity and expression information about a plurality of marker genes for several 
types or classes of lung cancer disclosed herein. In addition, information about marker genes 
for normal lung tissue may be included. 

[0050] Information stored on a computer system or data medium as described above is useful 
as a reference for comparison with expression data generated in an assay of lung tissue of 
unknown disease status. 
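
As a non-limiting sketch of such a comparison, a sample profile can be matched to the stored reference profile with which it correlates most strongly. The class labels and expression values below are invented for illustration, and the use of Pearson correlation as the similarity measure is an assumption of this sketch, not a statement of the method used to define the classes disclosed herein.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)


# Hypothetical stored reference profiles: one value per marker gene,
# in the same marker order for every class.
REFERENCE = {
    "class_1": [8.0, 2.0, 1.0, 5.0],
    "class_2": [1.0, 7.0, 6.0, 2.0],
    "normal_lung": [3.0, 3.0, 3.5, 3.0],
}


def closest_reference(sample, reference):
    """Return the label of the reference profile best correlated with the sample."""
    return max(reference, key=lambda label: pearson(sample, reference[label]))


if __name__ == "__main__":
    unknown = [7.5, 2.5, 1.5, 4.0]
    print(closest_reference(unknown, REFERENCE))
```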

[0051] Finally, the present invention provides methods for identifying, evaluating, and 
monitoring drug candidates for the treatment of different lung cancer types or 
adenocarcinoma subclasses. According to the invention, a candidate drug is assayed for its 
ability to decrease the expression of one or more markers of lung cancer. In one 
embodiment, a specific drug may reduce the expression of markers for a specific type or 
subclass of lung carcinoma described herein. Alternatively, a preferred drug may have a 
general effect on lung cancer and decrease the expression of different markers characteristic 
of different types or classes of lung carcinoma. In one embodiment, a preferred drug 
decreases the expression of a lung cancer marker by killing lung cancer cells or by interfering 
with their replication. 

[0052] In one embodiment, the screening assays for drug candidates are performed on 
proteins encoded by the nucleic acids that are identified as having an increased expression in 
specific subclasses or types of lung carcinoma. In another embodiment, the screening assays 
for drug candidates are performed on nucleic acids that are differentially expressed in various 
subclasses or types of lung cancer when compared with normal samples. 
[0053] In one embodiment, a candidate drug is added to cells or sample tissue prior to 
analysis. Preferred cells are cell lines grown from different types of cancer (e.g. different 
classes or subclasses of lung cancer). Alternatively, cells isolated directly from tumor tissue 
can be assayed. In another embodiment, the invention provides screens for a candidate drug 
which modulates lung cancer, modulates lung cancer gene expression and/or protein 
expression, modulates lung cancer genes or protein activity, binds to a lung cancer protein, or 
interferes with the binding of a lung cancer protein and an antibody. 
[0054] The term "candidate drug" or equivalent as used herein describes any molecule, e.g., 
an antibody, protein, oligopeptide, fatty acid, steroid, small organic molecule, polysaccharide, 
polynucleotide, antisense molecule, ligand, bioactive partner and structural analogs or 
combinations thereof to be tested for candidate drugs that are capable of directly or indirectly 
altering the lung cancer phenotype, or the expression of one or more lung cancer markers as 
identified herein, or overall gene and/or protein expression. Accordingly, methods of the 
invention include assays for monitoring the expression of nucleic acids and protein. 
[0055] Preferred assays screen for candidate drugs that modulate the overall expression of 
specific gene clusters identified herein (for example, one or more genes in Tables 1-9), or the 
expression of specific nucleic acids or proteins within the clusters. In a particularly preferred 
embodiment, an assay identifies a candidate drug that suppresses a lung cancer phenotype, 
for example to a normal lung tissue phenotype. A variety of assays can be executed for drug 
screening. For example, once a specific gene is identified as being differentially expressed 
by the methods of the invention, candidate drugs that specifically modulate expression or 
levels of the specific gene may be identified. For example, candidate drugs may be identified 
that down regulate expression of the specific gene. In one embodiment, candidate drugs may 
be identified that up regulate expression of the specific gene. Generally a plurality of assay 
mixtures are run in parallel with different drug concentrations to obtain a differential 
response to the various concentrations. Typically, one of these concentrations serves as a 
negative control, i.e., at zero concentration or below the level of detection. 
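
Purely as an illustrative sketch, the differential response across parallel assay mixtures can be expressed relative to the negative control. The concentrations and readings below are invented, and normalizing each reading to the zero-concentration control is one assumed way of summarizing the response, not a required step of the methods described herein.

```python
def fold_change_vs_control(readings):
    """Express each reading as a fraction of the negative control.

    `readings` maps drug concentration to measured marker expression.
    The entry at concentration 0.0 is taken as the negative control.
    """
    control = readings[0.0]
    if control <= 0:
        raise ValueError("negative control reading must be positive")
    return {
        conc: level / control
        for conc, level in sorted(readings.items())
        if conc != 0.0
    }


if __name__ == "__main__":
    # Hypothetical readings from assay mixtures run in parallel.
    assay = {0.0: 1000.0, 0.1: 900.0, 1.0: 500.0, 10.0: 200.0}
    for conc, ratio in fold_change_vs_control(assay).items():
        print(conc, round(ratio, 2))
```

A ratio below 1 at increasing concentrations would be consistent with a candidate drug that down regulates the marker, and a ratio above 1 with one that up regulates it.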
[0056] The amount of gene expression can be monitored at either the gene level or the 
protein level, i.e., the amount of gene expression may be monitored using nucleic acid probes, 
and methods known in the art may be used to quantify gene expression levels. Alternatively, 
the gene product itself can be monitored, for example through the use of antibodies to the 
proteins encoded by the nucleic acids identified by the methods of the invention, and in 
standard immunoassays. 

[0057] In one embodiment, candidate drugs or agents are naturally occurring proteins or 
fragments of naturally occurring proteins. Thus, for example, cellular extracts containing 
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In 
this way libraries of prokaryotic and eukaryotic proteins may be made for screening by the 
methods of the invention. Particularly preferred in this embodiment are libraries of bacterial, 
fungal, viral, and mammalian proteins, with the latter being preferred, and human proteins 
being especially preferred. 

[0058] In another embodiment, candidate drugs are peptides of from about 5 to about 30 
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to 
about 15 being particularly preferred. The peptides may be digests of naturally occurring 
proteins as is outlined above, random peptides, or "biased" random peptides. By "random" or 
equivalents herein is meant that each nucleic acid and peptide consists of essentially random 
nucleotides and amino acids, respectively. Since generally these random peptides (or nucleic 
acids), are chemically synthesized, they may incorporate any nucleotide or amino acid at any 
position. The synthetic process can be designed to generate randomized proteins or nucleic 
acids, to allow the formation of all or most of the possible combinations over the length of the 
sequence, thus forming a library of randomized candidate proteinaceous drugs. 
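
The notion of a randomized peptide library can be illustrated with the following sketch, which draws each position uniformly from the twenty standard amino acids. The library size, the length bounds of 7 to 15 residues (taken from the particularly preferred range above), and the fixed random seed are assumptions chosen for the example. The sketch models the sequences only and says nothing about the chemistry of synthesis.

```python
import random

# One-letter codes for the twenty standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide(length, rng):
    """Return a peptide in which every position is drawn uniformly at random."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def random_library(size, min_len, max_len, seed=0):
    """Return `size` random peptides with lengths in [min_len, max_len]."""
    rng = random.Random(seed)
    return [
        random_peptide(rng.randint(min_len, max_len), rng)
        for _ in range(size)
    ]


def sequence_space(length):
    """Number of distinct peptides of the given length."""
    return len(AMINO_ACIDS) ** length


if __name__ == "__main__":
    library = random_library(size=5, min_len=7, max_len=15)
    for peptide in library:
        print(peptide)
    print(sequence_space(7))
```

Because the number of possible sequences grows as 20 raised to the peptide length, a library of practical size samples only a fraction of the possible combinations for all but the shortest peptides.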
[0059] In another embodiment, the candidate drugs are nucleic acids. As described above 
generally for proteins, nucleic acid candidate drugs may be naturally occurring nucleic acids 
or random nucleic acids. For example, digests of prokaryotic or eukaryotic genomes may be 
used as is outlined above for proteins. 

[0060] In a preferred embodiment, nucleic acid drug candidates are antisense molecules. 
Drug candidates that are antisense molecules include antisense or sense oligonucleotides 
comprising a single-strand nucleic acid sequence (either RNA or DNA) capable of binding to 
target mRNA or DNA sequences for lung cancer molecules identified by the methods of the 
invention. For example, a preferred antisense molecule is a molecule that binds a nucleic 
acid sequence encoding Kallikrein 11. The antisense molecule can either bind a full-length 
nucleic acid encoding Kallikrein 11, for example the full-length DNA or mRNA encoding 
Kallikrein 11, or a partial nucleic acid sequence for Kallikrein 11. Antisense or sense 
oligonucleotides typically include a fragment of generally about 14 nucleotides, preferably 
about 14 to 30 nucleotides. However, it is understood that the length of the antisense or sense 
nucleotides will depend on the length of the target nucleic acid or a fragment thereof. 
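
For illustration, the sequence of an antisense RNA oligonucleotide directed against a fragment of a target mRNA is the reverse complement of that fragment. The sketch below assumes a made-up target sequence (it is not the Kallikrein 11 sequence) and enforces the 14 to 30 nucleotide range mentioned above as a simple check.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target_mrna, start, length):
    """Return the antisense RNA (written 5' to 3') for a target fragment.

    The fragment is target_mrna[start:start + length], read 5' to 3'.
    """
    if not 14 <= length <= 30:
        raise ValueError("length must be between 14 and 30 nucleotides")
    fragment = target_mrna[start:start + length]
    if len(fragment) != length:
        raise ValueError("fragment runs past the end of the target")
    return "".join(COMPLEMENT[base] for base in reversed(fragment))


if __name__ == "__main__":
    # Placeholder sequence for illustration only.
    target = "AUGGCUAGCUAGGCUAACGGAUCC"
    print(antisense_rna(target, start=0, length=14))  # GCCUAGCUAGCCAU
```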
[0061] In yet another preferred embodiment, drug candidates are antibodies. An antibody 
used in methods for screening for a candidate drug may either bind a full length protein or a 
fragment thereof. In a preferred embodiment, the antibody binds a unique epitope on a target 
protein and shows little or no cross-reactivity. The term "antibody" is understood to include 
antibody fragments, as are known in the art, including Fab, Fab2, single chain antibodies 
(Fv for example), chimeric antibodies, etc., either produced by the modification of whole 
antibodies or those synthesized de novo using recombinant DNA technologies known in the 
art. 

[0062] Antibodies as used herein as drug candidates include both polyclonal and monoclonal 
antibodies. Polyclonal antibodies can be raised in a mammal, for example, by one or more 
injections of an antigenic agent and, if desired, an adjuvant. It may be useful to conjugate the 
antigenic agent to a protein known to be immunogenic in the mammal being immunized. 
Preferred antigenic agents include cancer specific antigens, and more preferably lung cancer 
specific antigens. Examples of adjuvants which may be employed include Freund's complete 
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose 
dicorynomycolate). 

[0063] The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies 
may be prepared using various hybridoma methods known in the art. For example, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent 
to elicit lymphocytes that produce or are capable of producing antibodies that will 
specifically bind to the immunizing agent. Alternatively, the lymphocytes may be immunized 
in vitro. An immunizing agent is preferably a protein or fragment thereof that is differentially 
expressed in subclasses or types of lung cancer. However, other known cancer specific 
antigens may also be used. In a preferred embodiment, the immunizing agent is the full 
length Kallikrein 11 protein or a homolog or derivative thereof. In another embodiment, the 
immunizing agent is a partial-length Kallikrein 11 protein or a homolog or derivative thereof. 
[0064] Panels of available antibodies may also be screened for their effect on the expression 
of lung specific gene clusters (or specific genes or subsets of genes within these clusters). In 
one embodiment, some or all of the antibodies being screened are not known to be associated 
with any cancer specific antigen. In one embodiment, the antibodies are bispecific 
antibodies. Bispecific antibodies are monoclonal, preferably human or humanized, 
antibodies that have binding specificities for at least two different antigens. 
[0065] 

[0066] In yet another embodiment, the candidate drugs are chemical compounds. In a 
preferred embodiment, the candidate drugs are small organic compounds having a molecular 
weight of more than 100 and less than about 2500 daltons. Candidate drugs may also include 
functional groups necessary for structural interaction with proteins or nucleic acids. 
[0067] According to the invention, levels of marker genes disclosed herein can be used to 
follow the course of a lung cancer in a patient. Methods of the invention are therefore useful 
to evaluate the effectiveness of a particular treatment. In addition, methods of the invention 
are also useful to monitor the progression of a lung cancer in a patient, for example from a C4 
to a C3 to a C2 adenocarcinoma. 

[0068] The identification of candidates that, alone or admixed with other suitable molecules, 
are competent to treat lung cancer is contemplated by the invention. Further, the production 
of commercially significant quantities of the aforementioned identified candidates, which are 
suitable for the prevention and/or treatment of lung, colon, or other cancer is contemplated. 
Moreover, the invention provides for the production of therapeutic grade commercially 
significant quantities of therapeutic agents in which any undesirable properties of the initially 
identified analog, such as in vivo toxicity or a tendency to degrade upon storage, are 
mitigated. 

[0069] Methods of preventing and treating cancer, after the identification of an antibody, 
peptide, peptidomimetic, nucleic acid, or small molecule, include the step of administering a 
composition including such a compound to a patient. 

[0070] Nucleic acid molecules (including DNA, RNA, and nucleic acid analogs such as 
PNA) which are themselves active or which code for active expressed products; peptides; 
proteins; antibodies; or other chemical compounds isolated and identified, or based upon or 
derived from ligands isolated and identified according to the invention (also referred to as 
active compounds or drugs) can be incorporated into pharmaceutical compositions suitable 
for administration. Such active compounds or drugs include inhibitors identified or 
constructed as a result of isolating and identifying ligands according to the invention. The 
drug compounds discovered according to the present invention can be administered to a 
mammalian host by any route. Thus, as appropriate, administration can be oral or parenteral, 
including intravenous and intraperitoneal routes of administration. In addition, 
administration can be by periodic injections of a bolus of the drug, or can be made more 
continuous by intravenous or intraperitoneal administration from a reservoir which is external 
(e.g., an i.v. bag). In certain embodiments, the drugs of the instant invention can be 
therapeutic-grade. That is, certain embodiments comply with standards of purity and quality 
control required for administration to humans. Veterinary applications are also within the 
intended meaning as used herein. 

[0071] The formulations, both for veterinary and for human medical use, of the drugs 
according to the present invention typically include such drugs in association with a 
pharmaceutically acceptable carrier therefor and optionally other therapeutic ingredient(s). 
The carrier(s) can be "acceptable" in the sense of being compatible with the other ingredients 
of the formulations and not deleterious to the recipient thereof. Pharmaceutically acceptable 
carriers, in this regard, are intended to include any and all solvents, dispersion media, 
coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the 
like, compatible with pharmaceutical administration. The use of such media and agents for 
pharmaceutically active substances is known in the art. Except insofar as any conventional 
media or agent is incompatible with the active compound, use thereof in the compositions is 
contemplated. Supplementary active compounds (identified according to the invention 
and/or known in the art) also can be incorporated into the compositions. The formulations 
can conveniently be presented in dosage unit form and can be prepared by any of the methods 
well known in the art of pharmacy/microbiology. In general, some formulations are prepared 
by bringing the drug into association with a liquid carrier or a finely divided solid carrier or 
both, and then, if necessary, shaping the product into the desired formulation. 
[0072] A pharmaceutical composition of the invention is formulated to be compatible with its 
intended route of administration. Examples of routes of administration include oral or 
parenteral, e.g., intravenous, intradermal, inhalation, transdermal (topical), transmucosal, and 
rectal administration. Solutions or suspensions used for parenteral, intradermal, or 
subcutaneous application can include the following components: a sterile diluent such as 
water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene 
glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl 
parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as 
ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents 
for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with 
acids or bases, such as hydrochloric acid or sodium hydroxide. 
[0073] Useful solutions for oral or parenteral administration can be prepared by any of the 
methods well known in the pharmaceutical art, described, for example, in Remington's 
Pharmaceutical Sciences, (Gennaro, A., ed.), Mack Pub., 1990. Formulations for parenteral 
administration also can include glycocholate for buccal administration, methoxysalicylate for 
rectal administration, or citric acid for vaginal administration. The parenteral preparation 
can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or 
plastic. Suppositories for rectal administration also can be prepared by mixing the drug with 
a non-irritating excipient such as cocoa butter, other glycerides, or other compositions that 
are solid at room temperature and liquid at body temperatures. Formulations also can 
include, for example, polyalkylene glycols such as polyethylene glycol, oils of vegetable 
origin, hydrogenated naphthalenes, and the like. Formulations for direct administration can 
include glycerol and other compositions of high viscosity. Other potentially useful parenteral 
carriers for these drugs include ethylene-vinyl acetate copolymer particles, osmotic pumps, 
implantable infusion systems, and liposomes. Formulations for inhalation administration can 
contain as excipients, for example, lactose, or can be aqueous solutions containing, for 
example, polyoxyethylene-9-lauryl ether, glycocholate and deoxycholate, or oily solutions for 
administration in the form of nasal drops, or as a gel to be applied intranasally. Retention 
enemas also can be used for rectal delivery. 

[0074] Formulations of the present invention suitable for oral administration can be in the 
form of discrete units such as capsules, gelatin capsules, sachets, tablets, troches, or lozenges, 
each containing a predetermined amount of the drug; in the form of a powder or granules; in 
the form of a solution or a suspension in an aqueous liquid or non-aqueous liquid; or in the 
form of an oil-in-water emulsion or a water-in-oil emulsion. The drug can also be 
administered in the form of a bolus, electuary or paste. A tablet can be made by compressing 
or moulding the drug optionally with one or more accessory ingredients. Compressed tablets 
can be prepared by compressing, in a suitable machine, the drug in a free-flowing form such 
as a powder or granules, optionally mixed with a binder, lubricant, inert diluent, surface active 
or dispersing agent. Moulded tablets can be made by moulding, in a suitable machine, a 
mixture of the powdered drug and suitable carrier moistened with an inert liquid diluent. 
[0075] Oral compositions generally include an inert diluent or an edible carrier. For the 
purpose of oral therapeutic administration, the active compound can be incorporated with 
excipients. Oral compositions prepared using a fluid carrier for use as a mouthwash include 
the compound in the fluid carrier and are applied orally and swished and expectorated or 
swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be 
included as part of the composition. The tablets, pills, capsules, troches and the like can 
contain any of the following ingredients, or compounds of a similar nature: a binder such as 
microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a 
disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as 
magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening 
agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, 
or orange flavoring. 

[0076] Pharmaceutical compositions suitable for injectable use include sterile aqueous 
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous 
preparation of sterile injectable solutions or dispersion. For intravenous administration, 
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, 
Parsippany, NJ) or phosphate buffered saline (PBS). In all cases, the composition can be 
sterile and can be fluid to the extent that easy syringability exists. It can be stable under the 
conditions of manufacture and storage and can be preserved against the contaminating action 
of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion 
medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene 
glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The 
proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the 
maintenance of the required particle size in the case of dispersion and by the use of 
surfactants. Prevention of the action of microorganisms can be achieved by various 
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic 
acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, 
for example, sugars, polyalcohols such as mannitol, sorbitol, and sodium chloride in the 
composition. Prolonged absorption of the injectable compositions can be brought about by 
including in the composition an agent which delays absorption, for example, aluminum 
monostearate and gelatin. 

[0077] Sterile injectable solutions can be prepared by incorporating the active compound in 
the required amount in an appropriate solvent with one or a combination of ingredients 
enumerated above, as required, followed by filtered sterilization. Generally, dispersions are 
prepared by incorporating the active compound into a sterile vehicle which contains a basic 
dispersion medium and the required other ingredients from those enumerated above. In the 
case of sterile powders for the preparation of sterile injectable solutions, methods of 
preparation include vacuum drying and freeze-drying which yields a powder of the active 
ingredient plus any additional desired ingredient from a previously sterile-filtered solution 
thereof. 

[0078] Formulations suitable for intra-articular administration can be in the form of a sterile 
aqueous preparation of the drug which can be in microcrystalline form, for example, in the 
form of an aqueous microcrystalline suspension. Liposomal formulations or biodegradable 
polymer systems can also be used to present the drug for both intra-articular and ophthalmic 
administration. 

[0079] Formulations suitable for topical administration include liquid or semi-liquid 
preparations such as liniments, lotions, gels, applicants, oil-in-water or water-in-oil emulsions 
such as creams, ointments or pastes; or solutions or suspensions such as drops. Formulations 
for topical administration to the skin surface can be prepared by dispersing the drug with a 
dermatologically acceptable carrier such as a lotion, cream, ointment or soap. In some 
embodiments, useful are carriers capable of forming a film or layer over the skin to localize 
application and inhibit removal. Where adhesion to a tissue surface is desired the 
composition can include the drug dispersed in a fibrinogen-thrombin composition or other 
bioadhesive. The drug then can be painted, sprayed or otherwise applied to the desired tissue 
surface. For topical administration to internal tissue surfaces, the agent can be dispersed in a 
liquid tissue adhesive or other substance known to enhance adsorption to a tissue surface. 
For example, hydroxypropylcellulose or fibrinogen/thrombin solutions can be used to 
advantage. Alternatively, tissue-coating solutions, such as pectin-containing formulations 
can be used. 

[0080] For inhalation treatments, inhalation of powder (self-propelling or spray formulations) 
dispensed with a spray can, a nebulizer, or an atomizer can be used. Such formulations can 
be in the form of a finely comminuted powder for pulmonary administration from a powder 
inhalation device or self-propelling powder-dispensing formulations. In the case of self- 
propelling solution and spray formulations, the effect can be achieved either by choice of a 
valve having the desired spray characteristics (i.e., being capable of producing a spray having 
the desired particle size) or by incorporating the active ingredient as a suspended powder in 
controlled particle size. For administration by inhalation, the compounds also can be 
delivered in the form of an aerosol spray from a pressured container or dispenser which 
contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer. Nasal drops 
also can be used. 

[0081] Systemic administration also can be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be 
permeated are used in the formulation. Such penetrants generally are known in the art, and 
include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid 
derivatives. Transmucosal administration can be accomplished through the use of nasal 
sprays or suppositories. For transdermal administration, the active compounds typically are 
formulated into ointments, salves, gels, or creams as generally known in the art. 
[0082] In one embodiment, the active compounds are prepared with carriers that will protect 
the compound against rapid elimination from the body, such as a controlled release 
formulation, including implants and microencapsulated delivery systems. Biodegradable, 
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, 
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of 
such formulations will be apparent to those skilled in the art. The materials also can be 
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal 
suspensions can also be used as pharmaceutically acceptable carriers. These can be prepared 
according to methods known to those skilled in the art, for example, as described in U.S. Pat. 
No. 4,522,811. Microsomes and microparticles also can be used. 
[0083] Oral or parenteral compositions can be formulated in dosage unit form for ease of 
administration and uniformity of dosage. Dosage unit form refers to physically discrete units 
suited as unitary dosages for the subject to be treated; each unit containing a predetermined 
quantity of active compound calculated to produce the desired therapeutic effect in 
association with the required pharmaceutical carrier. The specifications for the dosage unit 
forms of the invention are dictated by and directly dependent on the unique characteristics of 
the active compound and the particular therapeutic effect to be achieved, and the limitations 
inherent in the art of compounding such an active compound for the treatment of individuals. 
[0084] Generally, the drugs identified according to the invention can be formulated for 
parenteral or oral administration to humans or other mammals, for example, in therapeutically 
effective amounts, e.g., amounts which provide appropriate concentrations of the drug to 
target tissue for a time sufficient to induce the desired effect. Additionally, the drugs of the 
present invention can be administered alone or in combination with other molecules known to 
have a beneficial effect on the particular disease or indication of interest. By way of example 
only, useful cofactors include symptom-alleviating cofactors, including antiseptics, 
antibiotics, antiviral and antifungal agents and analgesics and anesthetics. 
[0085] Where a peptide, peptidomimetic, small molecule or other drug identified according 
to the invention is to be used as part of a transplant procedure (e.g. a lung transplant 
procedure), it can be provided to the living tissue or organ to be transplanted prior to removal 
of tissue or organ from the donor. The drug can be provided to the donor host. 
[0086] Alternatively, or in addition, once removed from the donor, the organ or living tissue 
can be placed in a preservation solution containing the drug. In all cases, the drug can be 
administered directly to the desired tissue, as by injection to the tissue, or it can be provided 
systemically, either by oral or parenteral administration, using any of the methods and 
formulations described herein and/or known in the art. 

[0087] Where the drug comprises part of a tissue or organ preservation solution, any 
commercially available preservation solution can be used to advantage. For example, useful 
solutions known in the art include Collins solution, Wisconsin solution, Belzer solution, 
Eurocollins solution and lactated Ringer's solution. Generally, an organ preservation solution 
usually possesses one or more of the following properties: (a) an osmotic pressure 
substantially equal to that of the inside of a mammalian cell (solutions typically are 
hyperosmolar and have K+ and/or Mg++ ions present in an amount sufficient to produce an 
osmotic pressure slightly higher than the inside of a mammalian cell); (b) the solution 
typically is capable of maintaining substantially normal ATP levels in the cells; and (c) the 
solution usually allows optimum maintenance of glucose metabolism in the cells. Organ 
preservation solutions also can contain anticoagulants, energy sources such as glucose, 
fructose and other sugars, metabolites, heavy metal chelators, glycerol and other materials of 
high viscosity to enhance survival at low temperatures, free oxygen radical inhibiting and/or 
scavenging agents and a pH indicator. A detailed description of preservation solutions and 
useful components can be found, for example, in U.S. Pat. No. 5,002,965, the disclosure of 
which is incorporated herein by reference. 

[0088] The effective concentration of the drugs identified according to the invention that is to 
be delivered in a therapeutic composition will vary depending upon a number of factors, 
including the final desired dosage of the drug to be administered and the route of 
administration. The preferred dosage to be administered also is likely to depend on such 
variables as the type and extent of disease or indication to be treated, the overall health status 
of the particular patient, the relative biological efficacy of the drug delivered, the formulation 
of the drug, the presence and types of excipients in the formulation, and the route of 
administration. In some embodiments, the drugs of this invention can be provided to an 
individual using typical dose units deduced from the earlier-described mammalian studies 
using non-human primates and rodents. As described above, a dosage unit refers to a unitary, 
i.e. a single dose which is capable of being administered to a patient, and which can be 
readily handled and packed, remaining as a physically and biologically stable unit dose 
comprising either the drug as such or a mixture of it with solid or liquid pharmaceutical 
diluents or carriers. 

[0089] In certain embodiments, organisms are engineered to produce drugs identified
according to the invention. These organisms can release the drug for harvesting or can be
introduced directly to a patient. In another series of embodiments, cells can be utilized to
serve as a carrier of the drugs identified according to the invention.

[0090] The pharmaceutical compositions can be included in a container, pack, or dispenser
together with instructions for administration.

[0091] Drugs identified by a method of the invention also include the prodrug derivatives of
the compounds. The term prodrug refers to a pharmacologically inactive (or partially
inactive) derivative of a parent drug molecule that requires biotransformation, either
spontaneous or enzymatic, within the organism to release the active drug. Prodrugs are
variations or derivatives of the compounds of the invention which have groups cleavable
under metabolic conditions. Prodrugs become the compounds of the invention which are
pharmaceutically active in vivo, when they undergo solvolysis under physiological conditions
or undergo enzymatic degradation. Prodrug compounds of this invention can be called
single, double, triple, and so on, depending on the number of biotransformation steps required
to release the active drug within the organism, and indicating the number of functionalities
present in a precursor-type form. Prodrug forms often offer advantages of solubility, tissue
compatibility, or delayed release in the mammalian organism (see, Bundgard, Design of
Prodrugs, pp. 7-9, 21-24, Elsevier, Amsterdam 1985 and Silverman, The Organic Chemistry
of Drug Design and Drug Action, pp. 352-401, Academic Press, San Diego, Calif., 1992).
Prodrugs commonly known in the art include acid derivatives known to practitioners of the
art, such as, for example, esters prepared by reaction of the parent acids with a suitable
alcohol, or amides prepared by reaction of the parent acid compound with an amine, or basic
groups reacted to form an acylated base derivative. Moreover, the prodrug derivatives of
drugs discovered according to this invention can be combined with other features herein
taught to enhance bioavailability.

[0092] Drugs as identified by the methods described herein can be administered to 
individuals to treat (prophylactically or therapeutically) various stages or subclasses of 
cancer. In conjunction with such treatment, pharmacogenomics (i.e., the study of the
relationship between an individual's genotype and that individual's response to a foreign
compound or drug) can be considered. Differences in metabolism of therapeutics can lead to
severe toxicity or therapeutic failure by altering the relation between dose and blood
concentration of the pharmacologically active drug. Thus, a physician or clinician can 
consider applying knowledge obtained in relevant pharmacogenomics studies in determining 
whether to administer a drug as well as tailoring the dosage and/or therapeutic regimen of 
treatment with the drug. 

[0093] Pharmacogenomics deals with clinically significant hereditary variations in the
response to drugs due to altered drug disposition and abnormal action in affected persons.
See e.g., Eichelbaum, M., Clin Exp Pharmacol Physiol, 1996, 23(10-11):983-985 and
Linder, M. W., Clin Chem, 1997, 43(2):254-266. In general, two types of pharmacogenetic
conditions can be differentiated: genetic conditions transmitted as a single factor altering the
way drugs act on the body (altered drug action), and genetic conditions transmitted as single
factors altering the way the body acts on drugs (altered drug metabolism). These
pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring
polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a
common inherited enzymopathy in which the main clinical complication is haemolysis after
ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and
consumption of fava beans.

[0094] One pharmacogenomics approach to identifying genes that predict drug response,
known as "a genome-wide association," utilizes a high-resolution map of the human genome
consisting of already known gene-related markers (e.g., a "bi-allelic" gene marker map which
consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of
which has two variants). Such a high-resolution genetic map can be compared to a map of
the genome of each of a statistically significant number of patients taking part in a Phase
II/III drug trial to identify markers associated with a particular observed drug response or side
effect. Alternatively, such a high-resolution map can be generated from a combination of
some ten million known single nucleotide polymorphisms (SNPs) in the human genome. A
SNP is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For
example, a SNP can occur once per every 1000 bases of DNA. A SNP can be involved in a
disease process; however, the vast majority may not be disease-associated. Given a genetic
map based on the occurrence of such SNPs, individuals can be grouped into genetic
categories depending on a particular pattern of SNPs in their individual genome. In such a
manner, treatment regimens can be tailored to groups of genetically similar individuals,
taking into account traits that can be common among such genetically similar individuals.

[0095] Alternatively, a method termed the "candidate gene approach" can be utilized to
identify genes that predict drug response. According to this method, if a gene that encodes a 
drug's target is known, all common variants of that gene can be fairly easily identified in the 
population and it can be determined if having one version of the gene versus another is 
associated with a particular drug response. 

[0096] As an illustrative embodiment, the activity of drug metabolizing enzymes is a major 
determinant of both the intensity and duration of drug action. The discovery of genetic
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and 
cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why 
some patients do not obtain the expected drug effects or show exaggerated drug response and 
serious toxicity after taking the standard and safe dose of a drug. These polymorphisms are 
expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor 
metabolizer (PM). The prevalence of PM is different among different populations. For 
example, the gene coding for CYP2D6 is highly polymorphic and several mutations have 
been identified in PM, which all lead to the absence of functional CYP2D6. Poor 
metabolizers of CYP2D6 and CYP2C19 quite frequently experience exaggerated drug
response and side effects when they receive standard doses. If a metabolite is the active
therapeutic moiety, PM show no therapeutic response, as demonstrated for the analgesic
effect of codeine mediated by its CYP2D6-formed metabolite morphine. The other extreme
are the so-called ultra-rapid metabolizers who do not respond to standard doses. Recently,
the molecular basis of ultra-rapid metabolism has been identified to be due to CYP2D6 gene
amplification. Alternatively, a method termed "gene expression profiling" can be utilized
to identify genes that predict drug response. For example, the gene expression of an animal
dosed with a drug can give an indication whether gene pathways related to toxicity have been
turned on.

[0097] Information generated from more than one of the above pharmacogenomics
approaches can be used to determine appropriate dosage and treatment regimens for
prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing
or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance
therapeutic or prophylactic efficiency when treating a subject with a drug identified according
to the invention.



EXAMPLES 

Example 1: Materials and Methods 
Specimens and Datasets. 

[0098] A total of 203 snap-frozen lung tumors (n=186) and normal lung (n=17) specimens 
were used to create two datasets. Of these, 125 adenocarcinoma samples were associated 
with clinical data and with histological slides from adjacent sections.

[0099] The 203 specimens (Dataset A) include histologically-defined lung adenocarcinomas
(n=127), squamous cell lung carcinomas (n=21), pulmonary carcinoids (n=20), SCLC (n=6)
cases and normal lung (n=17) specimens. Other adenocarcinomas (n=12) were suspected to
be extrapulmonary metastases based on clinical history. Dataset B, a subset of Dataset A,
includes only adenocarcinomas and normal lung samples. 

Tumor Bank, Clinical Information, and Pathological Analysis 

[00100] The complete cohort for these studies consists of 203 patient samples that can 
be broken down into 139 lung adenocarcinomas (AD) that included 12 suspected metastases 
of extrapulmonary origin, 21 squamous (SQ) cell carcinoma cases, 20 pulmonary carcinoid
(COID) tumors and 6 small cell lung cancers (SCLC), as well as 17 normal lung (NL)
samples.

[00101] Tumor and normal lung specimens in this study were obtained from two
independent tumor banks. The following specimens were obtained from the Thoracic
Oncology Tumor Bank at the Brigham and Women's Hospital / Dana Farber Cancer Institute:
127 adenocarcinomas, 8 squamous cell carcinomas, 4 small cell carcinomas, and 14
pulmonary carcinoid samples. In addition, 12 adenocarcinoma samples without associated
clinical data were obtained from the Brigham/Dana-Farber tumor bank. In addition, 13
squamous cell carcinoma, 2 small cell lung carcinoma, and 6 carcinoid samples were
obtained from the Massachusetts General Hospital (MGH) Tumor Bank. The snap-frozen, 
anonymized samples from MGH were not associated with histological sections or clinical 
data. 

[00102] Frozen samples of resected lung tumors and parallel "normal" (grossly
uninvolved) lung (protocol 91-03831) for anonymous distribution to IRB-approved research
projects were obtained within 30 minutes of resection and subdivided into samples (~100
mg). Samples intended for nucleic acid extraction were snap-frozen on powdered dry ice and
individually stored at -140 °C. Each was associated with an immediately adjacent sample
embedded for histology in Optimal Cutting Temperature (OCT) medium and stored at -80
°C. Six-micron frozen sections of embedded samples stained with H&E were used to confirm
the post-operative pathologic diagnosis and to estimate the cellular composition of adjacent
extraction samples as discussed below. Each selected sample was further characterized by
examining viable tumor cells in H&E-stained frozen sections comprising at least 30%
nucleated cells and low levels of tumor necrosis (<40%). In addition, at least once,
pulmonary pathologists (I and II) independently evaluated adjacent OCT blocks for tumor
type and content. Notes were also taken for extent of fibrosis and inflammatory infiltrates.

[00103] Duplicate blocks, coupled with the identical OCT-embedded block, were also
available for 36 of the adenocarcinoma samples. The majority of these duplicate blocks were
within 1 to 1.5 cm from one another.

[00104] Clinical data from a prospective database and from the hospital records
included the age and sex of the patient, smoking history, type of resection, post-operative 
pathological staging, post-operative histopathological diagnosis, patient survival information, 
time of last follow-up interval or time of death from the date of resection, disease status at 
last follow-up or death (when known), and site of disease recurrence (when known). Code 
numbers were assigned to samples and correlated clinical data. The linkup between the code
numbers and all patient identifiers was destroyed, rendering the samples and clinical data 
completely anonymous. 

[00105] 125 adenocarcinoma samples were associated with clinical data.
Adenocarcinoma patients included 53 males and 72 females. There were 17 reported
non-smokers, 51 patients reporting less than a 40 pack-year smoking history, and 54 patients
reporting a greater than 40 pack-year smoking history. The post-operative
surgical-pathological staging of these samples included 76 stage I tumors, 24 stage II tumors,
10 stage III tumors, and 12 patients with putative metastatic tumors. Note that numbers do not always
add to 125, as complete information could not be found for each case. 

RNA extraction and Microarray Experiments 

[00106] Briefly, tissue samples were homogenized in Trizol (Life Technologies,
Gaithersburg, MD) and RNA was extracted and purified using the RNEASY column
purification kit (QIAGEN, Chatsworth, CA). RNA extracted from samples that were
collected from two different OCT blocks was given the sample code name followed by the
corresponding OCT block name. RNA integrity was assessed by denaturing formaldehyde gel
electrophoresis followed by northern blotting using a beta-actin probe. Samples were excluded if
beta-actin was not full-length.

[00107] Preparation of in vitro transcription (IVT) products and oligonucleotide array
hybridization and scanning were performed according to Affymetrix protocol (Santa Clara,
CA). In brief, the amount of starting total RNA for each IVT reaction varied between 15 and
20 µg. First-strand cDNA was generated using a T7-linked oligo-dT primer,
followed by second-strand synthesis. IVT reactions were performed in batches to generate
cRNA targets containing biotinylated UTP and CTP, which were subsequently chemically
fragmented at 95 °C for 35 minutes. Ten micrograms of the fragmented, biotinylated cRNA
was mixed with MES buffer (2-[N-Morpholino]ethanesulfonic acid) containing 0.5 mg/ml
acetylated bovine serum albumin (Sigma, St. Louis, MO) and hybridized to Affymetrix
(Santa Clara, CA) HGU95A v2 arrays at 45 °C for 16 hours. HGU95A v2 arrays contain
~12,600 genes and expressed sequence tags. Arrays were washed and stained with
streptavidin-phycoerythrin (SAPE, Molecular Probes). Signal amplification was performed
using a biotinylated anti-streptavidin antibody (Vector Laboratories, Burlingame, CA) at 3
µg/ml. A second staining with SAPE followed this. Normal goat IgG (2 mg/ml) was used as
a blocking agent. Scans on arrays were performed on Affymetrix scanners and the
expression value for each gene was calculated using Affymetrix GENECHIP software.
Minor differences in microarray intensity were corrected using a scaling method as detailed
below.

Example 2: Data Analysis 

Feature Selection and Hierarchical Clustering. 

[00108] For Dataset A, a standard deviation threshold of 50 expression units was used 
to select the 3,312 most variable transcript sequences. For Dataset B, 52 pairs of replicates 
(representing 36 duplicate adenocarcinomas) were used to determine the quality of the 
dataset, and 45 pairs having an R² value > 0.9 were used to select 675 transcript sequences
(features) whose expression varied the most across all sample pairs (Figs. 3-5). 

Preprocessing and Re-scaling 

[00109] The raw expression data for the first 12600 genes obtained from Affymetrix
GENECHIP software was re-scaled to account for different chip intensities. Each column
(sample) in the dataset was multiplied by 1/slope of a least squares linear fit of the sample vs.
the reference (a sample in the dataset). The linear fit was done using only genes that have
'Present' calls in both the sample being re-scaled and the reference. The sample chosen as
reference was a typical one (i.e. one with the number of "P" calls closest to the average over
all samples in the dataset). The reference sample for the dataset was AD114T1. Scans were
rejected if the scaling factor exceeded a factor of 4, if there were fewer than 30% 'Present'
calls, or if microarray artifacts were visible. Scans that failed the above criteria were
re-hybridized and re-scanned on new chips from the same fragmented cDNA.
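
By way of illustration only, the linear re-scaling step can be sketched in Python as follows. The function names, the toy values, and the use of an ordinary least-squares fit with an intercept term are assumptions made for this sketch; they are not taken from the GENECHIP software or from the procedure actually used.

```python
def least_squares_slope(x, y):
    """Slope b of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def rescale_to_reference(sample, sample_calls, reference, reference_calls):
    """Multiply `sample` by 1/slope of the fit of the sample vs. the reference.

    Only genes with a 'P' (Present) call in both arrays enter the fit.
    Returns the re-scaled sample and the scaling factor that was applied.
    """
    keep = [i for i in range(len(sample))
            if sample_calls[i] == "P" and reference_calls[i] == "P"]
    ref_vals = [reference[i] for i in keep]
    smp_vals = [sample[i] for i in keep]
    # Fit sample (y) against reference (x): sample ~ a + slope * reference.
    slope = least_squares_slope(ref_vals, smp_vals)
    factor = 1.0 / slope
    return [value * factor for value in sample], factor


# Hypothetical toy arrays: the sample is twice as bright as the reference.
reference = [100.0, 200.0, 300.0, 400.0, 50.0]
reference_calls = ["P", "P", "P", "P", "A"]
sample = [200.0, 400.0, 600.0, 800.0, 10.0]
sample_calls = ["P", "P", "P", "P", "A"]
rescaled, factor = rescale_to_reference(sample, sample_calls,
                                        reference, reference_calls)
```

In this sketch the returned factor could then be compared against the rejection limit of 4 mentioned above.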

[00110] However, linear scaling was insufficient to correct for non-linear responses 
that were observed, which may have resulted from saturation effects or IVT-variations from 
one batch to the other. Thus, a non-linear scaling was applied to adjust for such differences
(Fig. 3). The 2% trimmed means of "P" genes for all arrays after linear and non-linear rank
invariant scaling (described below) are shown in box plots stratified by IVT batches. The
batch differences in mean intensity may be due to the fact that a more homogenous IVT
processing was applied to arrays in the same IVT batch than arrays in different batches. Also
noticeable were the non-linear relationships in the scatter-plots of replicate arrays (Fig.
3) and reference RNA samples (Fig. 4), which justify non-linear scaling methods to make
expression values of genes across arrays more reasonable estimates of the actual expression 
values for transcripts and overall brightness of arrays. 

[00111] A rank-invariant scaling method (Tseng, G. C., Oh, M. K., Rohlin, L., Liao,
J. C. & Wong, W. H. (2001) Nucleic Acids Res 29, 2549-57) was used to scale all arrays
towards a baseline array (AD114T1). A set of genes whose ranks in the two arrays differed by
less than 50 (an empirical value chosen to make the points for selected genes naturally
form a tight curve) was used to fit a smoothing spline (Venables, W. N. & Ripley, B. D.
(1998) Modern Applied Statistics with S-PLUS (Springer, Berlin)) in the scatter-plot of the
array to be normalized (X-axis) and the baseline array (Y-axis). This "Invariant Set"
presumably consists of non-differentially expressed genes. The normalized values were
determined by reading off the values determined by the smoothing curve for values on the
X-axis. After scaling, the replicate arrays agreed better, and batch differences were less dramatic
(Fig. 3). Hence, the rank invariant-scaled data was used for all downstream analysis.
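
By way of illustration only, the rank-invariant scaling can be sketched in Python as follows. Two points are assumptions of the sketch rather than statements of the method used: the invariant set is taken to be the genes whose ranks in the two arrays differ by less than 50, and a piecewise-linear interpolation through the invariant-set points stands in for the smoothing spline. All function names are hypothetical, and the invariant set is assumed to be non-empty.

```python
from bisect import bisect_left


def ranks(values):
    """Return the rank (0 = smallest) of each value; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0] * len(values)
    for position, index in enumerate(order):
        rank[index] = position
    return rank


def invariant_set(array, baseline, max_rank_diff=50):
    """Indices of genes whose ranks in the two arrays differ by less than max_rank_diff."""
    r_array, r_base = ranks(array), ranks(baseline)
    return [i for i in range(len(array))
            if abs(r_array[i] - r_base[i]) < max_rank_diff]


def normalize_to_baseline(array, baseline, max_rank_diff=50):
    """Map each value of `array` onto the baseline scale via the invariant-set curve."""
    points = sorted((array[i], baseline[i])
                    for i in invariant_set(array, baseline, max_rank_diff))
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def curve(value):
        # Stand-in for the smoothing spline: piecewise-linear interpolation,
        # held constant outside the range covered by the invariant set.
        if value <= xs[0]:
            return ys[0]
        if value >= xs[-1]:
            return ys[-1]
        j = bisect_left(xs, value)
        fraction = (value - xs[j - 1]) / (xs[j] - xs[j - 1])
        return ys[j - 1] + fraction * (ys[j] - ys[j - 1])

    return [curve(v) for v in array]


# Hypothetical toy arrays: the baseline is uniformly twice as bright.
array = [10.0, 20.0, 30.0, 40.0]
baseline = [20.0, 40.0, 60.0, 80.0]
normalized = normalize_to_baseline(array, baseline)
```

A true smoothing spline would average over the scatter of the invariant-set points instead of passing through each of them, so the interpolation here only conveys the read-off step.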

Reproducibility Statistics 

[00112] Reproducibility controls included independent frozen tissue blocks for 36
adenocarcinomas resected from the lung, 16 replicates of IVT reactions or scans, and 13
reference RNA samples (Stratagene, La Jolla, California). Scaled expression values for 45 of
the 52 replicates compared were correlated with R > 0.9, and for 50 of the 52 replicates with
R > 0.85. Examples of pairwise correlations between replicates are shown in Fig. 5.

Replication Filtering 

[00113] According to the invention, technical noise may affect the measurement of
some genes more than others, and the already difficult problem of adenocarcinoma
sub-classification might be particularly sensitive to such noise. Accordingly, adenocarcinoma
replicates were used to select only highly reproducible features (representing genes) for
subsequent use in adenocarcinoma clustering. The reproducibility of 52 pairs of replicate
arrays randomly selected across the adenocarcinoma samples was assessed. For each pair of
replicates, a single measure of correlation (R²) was computed across all 12600 genes (Fig. 5).
Forty-five replicate pairs with R² values greater than 0.9 were used for filtering genes
(below).

[00114] For each gene, a scatter plot was generated with the selected 45 pairs of
replicate data points. The reproducibility of expression was assessed (Pearson correlation)
between replicate pairs as well as the variability of expression values across the 45 pairs. The
distribution of 45 pairwise expression datapoints was plotted for genes that were randomly
selected, and the correlation index of expression (a measure of a gene's variability between
samples) was obtained for each (Cluster Incl W26626, cor=0.0221;
desmoglein 3 (pemphi, cor=0.354; phosphoglucomutase 5, cor=0.311; ATP synthase, H+ tra,
cor=0.137; Cluster Incl A14316, cor=0.188; Cluster Incl Y12851, cor=0.2631; solute carrier
famil, cor=0.429; zinc finger protein, cor=0.179; Cluster Incl AA5866, cor=0.374; Cluster
Incl AA5866, cor=0.315; Cluster Incl M34428, cor=0.351; ets variant gene 2, cor=0.187;
RecQ protein-like 5, cor=0.366; Cluster Incl AJ0100, cor=0.378; one cut domain, fami,
cor=0.396; hexose-6-phosphate d, cor=0.0165; Cluster Incl AL0223, cor=0.376; synovial
sarcoma, X, cor=0.371; Cluster Incl S79325, cor=0.502; and Cluster Incl Z84717,
cor=0.513). To avoid spurious correlation measures, 2-4 outliers in each dimension were
removed from the calculation of correlation. In addition, genes whose expression levels did not
vary significantly across the 45 samples were eliminated because they were unlikely to be
informative. The number of features (genes) selected by this filter varied depending on the
Pearson correlation cut-off used. A clustering of adenocarcinomas was performed using 675
genes selected by a Pearson correlation threshold of 0.8. These genes have consistent
expression values between replicate arrays, and their expression across all adenocarcinoma
samples was variable. Selection of genes at Pearson correlation coefficients of 0.7 (1514
genes), 0.75 (1105 genes), or 0.85 (366 genes) led to roughly similar clustering. The
distribution of 45 pairwise expression datapoints was plotted for selected genes that varied
between the 45 adenocarcinoma replicates. The spread of the datapoints results in a
correlation index that can be used to select genes that are variant between adenocarcinomas.
Gene sets were selected based on their correlation cutoffs (0.7, 0.75, 0.8 and 0.85). To avoid
spurious correlation measures, 2-4 outliers in each dimension were removed from the
calculation of correlation. The expression ranges of genes in samples that pass a replicate
correlation greater than 0.85 include glyceraldehyde-3-pho, cor=0.873; glyceraldehyde-3-pho,
cor=0.861; trefoil factor 3, cor=0.966; thymosin, beta 10, cor=0.862; ribosomal protein LB,
cor=0.867; immunoglobulin kappa, cor=0.854; ribosomal protein S1, cor=0.882; melanoma
antigen, fa, cor=0.85; epithelial protein u, cor=0.889; metallothionein 1F, cor=0.88;
surfactant, pulmonar, cor=0.921; UDP glycosyltransfer, cor=0.931; melanoma antigen, fa,
cor=0.938; phospholipase A2, gr, cor=0.888; proline oxidase homo, cor=0.871; melanoma
antigen, fa, cor=0.922; ring finger protein, cor=0.91; Cluster Incl AF0151, cor=0.855;
tubulin, alpha, ubiq, cor=0.851; and secretory leukocyte, cor=0.934.
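
By way of illustration only, the replicate filter can be sketched in Python as follows. The data layout (one list per gene holding its value in the first and in the second member of each replicate pair), the function names, the toy values, and the use of "greater than or equal to" at the cut-off are assumptions of the sketch. The removal of 2-4 outliers and the separate variability filter described above are omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a gene that does not vary carries no information here
    return sxy / sqrt(sxx * syy)


def select_reproducible_genes(first, second, cutoff=0.8):
    """Keep genes whose expression correlates between replicate-pair members.

    first[gene][p] and second[gene][p] hold the expression of `gene` in the
    first and second array of replicate pair p.
    """
    return [gene for gene in first
            if pearson(first[gene], second[gene]) >= cutoff]


# Hypothetical toy data for four replicate pairs.
first = {
    "gene_A": [100.0, 500.0, 900.0, 1500.0],
    "gene_B": [100.0, 500.0, 900.0, 1500.0],
    "gene_C": [200.0, 200.0, 200.0, 200.0],
}
second = {
    "gene_A": [120.0, 480.0, 950.0, 1400.0],   # tracks the first replicate
    "gene_B": [900.0, 100.0, 1500.0, 500.0],   # does not track it
    "gene_C": [200.0, 200.0, 200.0, 200.0],    # constant across pairs
}
selected = select_reproducible_genes(first, second, cutoff=0.8)
```

Because a gene must vary across samples for its replicate values to correlate, a high cut-off in this sketch favors genes that are both reproducible and variable, consistent with the description above.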

Hierarchical Clustering 

[00115] Hierarchical clustering is an unsupervised learning method useful for dividing
data into natural groups. Data are clustered hierarchically by organizing the data into a tree
structure based upon the degree of correlation between features. CLUSTER (Eisen, M. B.,
Spellman, P. T., Brown, P. O. & Botstein, D. (1998) Proc Natl Acad Sci USA 95, 14863-8)
was used to perform average linkage clustering of both genes and arrays, using median
centering and normalization, and the results were displayed using TREEVIEW (Eisen, M.
B., Spellman, P. T., Brown, P. O. & Botstein, D. (1998) Proc Natl Acad Sci USA 95,
14863-8). This organizes all of the data elements into a single tree with the higher levels of
the tree representing the discovered classes. A threshold of 0 units was imposed before
clustering because the negative values may contribute to artifacts. After this preprocessing, a
set of genes was selected for clustering. For Dataset A, a variation filter was used that
required a standard deviation greater than or equal to 50 expression units across samples, and
3,312 genes were selected. More stringent variation filters were also applied (selecting as few
as 900 genes), which produced similar clustering results. For Dataset B, 675 genes were selected
based on the replicate filtering described above.
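
By way of illustration only, the preprocessing applied before clustering Dataset A can be sketched in Python as follows. The sketch assumes that "a threshold of 0 units" means that negative expression values are raised to 0 and that the standard deviation is the sample standard deviation; neither detail is stated in the text, and the function names and toy values are hypothetical.

```python
from statistics import stdev


def floor_at_zero(values):
    """Raise negative expression values to 0 (assumed meaning of the threshold)."""
    return [max(value, 0.0) for value in values]


def variation_filter(expression, min_sd=50.0):
    """Return the genes whose standard deviation across samples is >= min_sd.

    `expression` maps a gene name to its list of expression values,
    one value per sample.
    """
    kept = []
    for gene, values in expression.items():
        if stdev(floor_at_zero(values)) >= min_sd:
            kept.append(gene)
    return kept


# Hypothetical toy data: one gene that varies and one that is nearly flat.
expression = {
    "variable_gene": [-20.0, 10.0, 150.0, 300.0],
    "flat_gene": [100.0, 110.0, 95.0, 105.0],
}
kept = variation_filter(expression, min_sd=50.0)
```

Raising `min_sd` in this sketch corresponds to the more stringent variation filters mentioned above.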

[00116] In summary, a hierarchical clustering was performed on two data sets: Dataset
A, with 203 samples, and a subset, Dataset B, with 156 samples. Two distinct gene
selections were used (3,312 genes selected by standard deviation in Fig. 1 versus 675 genes
selected by replication filtering). To compare the results of these analyses, the clusters
defined in the adenocarcinomas were mapped onto a tree generated using 3,312 genes.
Clusters C2, C3 and C4 of the adenocarcinomas form consistently in both analyses.

Probabilistic Clustering 

[00117] In order to validate the taxonomy obtained by hierarchical clustering, a
model-based probabilistic clustering was also used (Cheeseman, P. & Stutz, J. (1996) in Advances
in Knowledge Discovery and Data Mining, eds. Fayyad, U. M., Piatetsky-Shapiro, G.,
Smyth, P. & Uthurasamy, R. (MIT Press, Cambridge); Titterington, D. M., Smith, A. F. &
Makov, U. F. (1985) Statistical Analysis of Finite Mixture Distributions (John Wiley, New
York)), and the number and composition of clusters obtained by the two methods were
compared. The specific program used for probabilistic clustering is AutoClass (Cheeseman,
P. & Stutz, J. (1996) in Advances in Knowledge Discovery and Data Mining, eds. Fayyad,
U. M., Piatetsky-Shapiro, G., Smyth, P. & Uthurasamy, R. (MIT Press, Cambridge)). The
method allows for the automatic selection of the number of clusters, and it performs a soft
partitioning of the data, whereby each sample can be fractionally assigned to more than one
cluster, thus reflecting the inherent uncertainty in the data (in practice, in all experiments
samples were assigned to a cluster with probability 1). Probabilistic model-based clustering,
usually referred to as finite-mixture models (Titterington, D. M., Smith, A. F. & Makov, U.
F. (1985) Statistical Analysis of Finite Mixture Distributions (John Wiley, New York)), is
built on the assumption that the observed data can be partitioned into sub-populations
(clusters), each governed by a distinct probability distribution. Since a priori the cluster
membership is not known, the resulting distribution of the observed data is a mixture of the
sub-population distributions. Learning, or inducing, the probabilistic model generating the
observed data thus entails determining the number of clusters (model selection), as well as the
parameters of the sub-population distributions (parameter estimation). The model selection
is based on a Bayesian score that measures the posterior probability of the model given the
observed data. Assuming all models are a priori equally likely, this translates into searching
for the model that assigns the highest probability to the observed data (i.e., which best
"explains" the data). It should be emphasized that the Bayesian score incorporates a
component that penalizes model complexity (the higher the number of clusters, the higher the
complexity of the model), thus automatically controlling for over-fitting. The parameter
estimation for this type of modelling is a combinatorial optimization problem for which an
exact solution is computationally infeasible. Therefore, an approximate solution needs to be
adopted. AutoClass adopts the Expectation-Maximization algorithm (EM), an iterative
procedure that, starting from a random initialization of the parameters, incrementally adjusts
them in an attempt to find their maximum likelihood estimates (under rather general
conditions, the procedure is guaranteed to converge to a local maximum) (Dempster, A. P.,
Laird, N. M. & Rubin, D. B. (1977) J Royal Stat Soc 39, 398-409; McLachlan, G. J. &
Krishnan, T. (1997) The EM Algorithm and Extensions (John Wiley, New York)). It is
important to point out that because of this random component in the estimation procedure,
different runs of the learning algorithms may yield different results (i.e., different parameters
- and consequently, different numbers of clusters - may be selected), a variability that is
accounted for in the experimental evaluation.
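
By way of illustration only, the Expectation-Maximization procedure can be sketched in Python for the simplest case of a one-dimensional mixture of two Gaussian sub-populations. This is not AutoClass: the number of clusters is fixed at two, the Bayesian model-selection score is omitted, and the function names, starting variances, and toy data are assumptions of the sketch.

```python
import math
import random


def gaussian_pdf(x, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    return (math.exp(-(x - mean) ** 2 / (2.0 * variance))
            / math.sqrt(2.0 * math.pi * variance))


def em_two_clusters(data, n_iterations=100, seed=0):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns the mixing weights, means, variances, and the soft
    (fractional) assignment of every sample to the two clusters.
    """
    rng = random.Random(seed)
    means = rng.sample(data, 2)   # random initialization of the means
    variances = [1.0, 1.0]        # assumed starting variances
    weights = [0.5, 0.5]
    responsibilities = []
    for _ in range(n_iterations):
        # E-step: fractional membership of each sample in each cluster.
        responsibilities = []
        for x in data:
            joint = [weights[k] * gaussian_pdf(x, means[k], variances[k])
                     for k in range(2)]
            total = sum(joint)
            responsibilities.append([j / total for j in joint])
        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            n_k = sum(r[k] for r in responsibilities)
            weights[k] = n_k / len(data)
            means[k] = sum(r[k] * x for r, x in zip(responsibilities, data)) / n_k
            spread = sum(r[k] * (x - means[k]) ** 2
                         for r, x in zip(responsibilities, data)) / n_k
            variances[k] = max(spread, 1e-6)  # guard against collapse to zero
    return weights, means, variances, responsibilities


# Hypothetical toy data: two well-separated sub-populations.
data = [-1.0, -0.5, 0.0, 0.5, 1.0, 9.0, 9.5, 10.0, 10.5, 11.0]
weights, means, variances, responsibilities = em_two_clusters(data)
```

Changing `seed` changes only the random initialization, which is the source of run-to-run variability referred to above.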

Experimental Evaluation of Probabilistic Clustering 

[00118] A model-based probabilistic clustering was applied to a data set of 156
samples (Dataset B). For the selection of the genes, the replicate filtering method was used
as described above. Two feature sets were used, the first including 675 genes (obtained by
setting the correlation threshold at 0.8), and the second including 1514 genes (correlation
threshold setting of 0.7). The use of different feature sets was aimed at testing for the
sensitivity of the clustering procedure to the number of genes included. AutoClass was then
applied to the resulting data set. For each feature set, two sets of experiments were run. In
the first experiment (Experiment 1), the learning algorithms were run 200 times, with the
only difference between successive runs being in the random initialization of the model
parameters. The aim of this experiment was to try to account for variability due to the
approximate nature of the estimation procedure. In the second experiment (Experiment 2),
the learning algorithms were run 200 times on "bootstrapped" data sets, where a
bootstrapped data set was obtained by randomly picking, with replacement, 156 samples from
the original data set. The bootstrapped data set differs from the original one in that some of
the samples may appear in it multiple times, while other samples may be missing altogether.
This experiment was aimed at testing for the robustness of the clustering results to random
variations in the observed data. Fig. 6 shows the distribution of the number of clusters over
multiple runs for the different settings. As expected, the variability in the number of clusters
over multiple iterations was higher in Experiment 2 (bootstrapping) than in Experiment 1
(random restart). This was due to the fact that in a bootstrapped data set, it often happens that
the same sample is included more than once (on average, over 200 iterations, each bootstrap
data set contained about 100 of the 156 samples in the original data set; in other words, on
average 56 samples were duplications of samples already included). If a sample was
included a sufficient number of times, the clustering algorithm may find it appropriate to
define a cluster for that sample only, thus artificially inflating the number of clusters. Despite
this variability, it was reassuring to see that this alternative clustering methodology selected a
number of clusters mostly varying between 6 and 9, very close to the number of clusters
selected by hierarchical clustering.
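
By way of illustration only, the bootstrap resampling of Experiment 2 can be sketched in Python as follows. The function names and the seed are assumptions of the sketch. When n samples are drawn with replacement from n, each sample is missed with probability (1 - 1/n)^n, so the expected number of distinct samples is n(1 - (1 - 1/n)^n), which for n = 156 is roughly 99 and agrees with the figure of about 100 given above.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement from range(n_samples)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def expected_distinct(n_samples):
    """Expected number of distinct samples in one bootstrapped data set."""
    # A given sample is missed by all n draws with probability (1 - 1/n) ** n.
    return n_samples * (1.0 - (1.0 - 1.0 / n_samples) ** n_samples)


def mean_distinct(n_samples=156, n_iterations=200, seed=0):
    """Average number of distinct samples over repeated bootstrap draws."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_iterations):
        total += len(set(bootstrap_indices(n_samples, rng)))
    return total / n_iterations


theoretical = expected_distinct(156)
simulated = mean_distinct(156, 200, seed=0)
```

The samples left out of a given draw are the reason a pair of samples may be absent from some iterations of Experiment 2.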

[00119] A visualization method was used to control for the consistency of the cluster 
composition over multiple runs, as well as to compare the clusters found by AutoClass with 
the ones obtained by hierarchical clustering. The visualization is a colored matrix, i.e., a 
color-based rendition of a corresponding symmetric matrix whose entries record a normalized 
measure of how often two samples appear in the same cluster across multiple runs. Rows and 
columns in this matrix were indexed by the samples in the data set, thus yielding a 156x156 
matrix, with each entry taking a real value between 0 and 1. An entry set to 0 (1) indicates 
that the two samples indexing that entry never (always) appear in the same cluster. More 
specifically, given two samples, the corresponding entry in the matrix records the quantity 
N_match/N_total, where N_total is the number of iterations in which both samples are 
included, and N_match denotes the number of iterations in which the two samples are 
included and are clustered together. Note that N_total is equal to the total number of 
iterations in Experiment 1, but not in Experiment 2, where it can often happen that a sample 
is not selected at all in a given iteration. 
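
By way of non-limiting illustration, the N_match/N_total matrix described above can be computed as sketched below. The representation of each run as a mapping from sample index to cluster label is an assumption made for the example, as is the use of None for pairs that never occur together.

```python
def coclustering_matrix(runs, n_samples):
    """Return an n_samples x n_samples matrix of N_match / N_total.

    runs: list of dicts mapping sample index -> cluster label for one
    iteration; samples missing from a dict were not drawn in that iteration.
    Entries are None when the two samples never occur in the same iteration.
    """
    n_match = [[0] * n_samples for _ in range(n_samples)]
    n_total = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        present = list(labels)
        for a in present:
            for b in present:
                n_total[a][b] += 1
                if labels[a] == labels[b]:
                    n_match[a][b] += 1
    return [
        [n_match[i][j] / n_total[i][j] if n_total[i][j] else None
         for j in range(n_samples)]
        for i in range(n_samples)
    ]
```

With random restarts every sample is present in every run, so N_total equals the number of runs; with bootstrap data sets N_total varies from pair to pair.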

[00120] Ideally, all entries in the matrix are either 0 or 1, corresponding to the situation 
where the cluster composition remains unchanged over multiple runs of the algorithm. 
Furthermore, if the samples are arranged in the matrix in the order produced by hierarchical 
clustering, a perfect agreement between the two clustering methodologies would translate 
into a block-diagonal matrix with blocks of 1's along the diagonal - each block 
corresponding to a different cluster - surrounded by 0's. Two-dimensional matrices were 
generated corresponding, respectively, to Experiment 1 (200 iterations with random restart on 
the original data set) and Experiment 2 (200 iterations on bootstrap data sets) for the 
675-gene data set. Corresponding two-dimensional matrices were generated for the 1514-gene 
data set. Blocks corresponding to the candidate clusters are clearly distinguishable along the 
diagonal in all four of the two-dimensional matrices, thus providing supporting evidence that 
the selected clusters were unaffected by random variations in the data set. 
k-Nearest Neighbor-based Marker Gene Selection and Supervised Learning 

[00121] Following definition of "classes" and their boundaries, a k-NN algorithm was 
used to choose "marker" genes whose expression best correlated with each class distinction. 
Class definitions were based on clustering. Marker genes were chosen based on the 
signal-to-noise statistic (μ_class0 - μ_class1)/(σ_class0 + σ_class1), where μ and σ represent 
the mean and standard deviation of expression, respectively, for each class (Golub, T. R., 
Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., 
Downing, J. R., Caligiuri, M. A., et al. (1999) Science 286, 531-7). 
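
By way of non-limiting illustration, the signal-to-noise statistic and the resulting gene ranking can be sketched as follows. The use of the sample standard deviation and the function names are assumptions; the specification does not state which estimator was used.

```python
from statistics import mean, stdev


def signal_to_noise(class0, class1):
    """(mean0 - mean1) / (sd0 + sd1) for one gene.

    Assumes each class has at least two values and a non-zero spread,
    and uses the sample standard deviation (an assumption).
    """
    return (mean(class0) - mean(class1)) / (stdev(class0) + stdev(class1))


def rank_markers(expression, in_class):
    """Sort gene names by decreasing signal-to-noise score.

    expression: dict gene -> list of expression values, one per sample.
    in_class: list of booleans, True where the sample belongs to the class.
    """
    scores = {}
    for gene, values in expression.items():
        inside = [v for v, m in zip(values, in_class) if m]
        outside = [v for v, m in zip(values, in_class) if not m]
        scores[gene] = signal_to_noise(inside, outside)
    return sorted(scores, key=scores.get, reverse=True)
```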

[00122] As a further test of the relative robustness of the sample clusters, a supervised 
classifier was built using the following methodology. Following marker gene selection, a 
classifier was built and evaluated through leave-one-out cross-validation. For each round of 
cross-validation, one sample was withheld and the remaining samples were used to build a 
"k-NN" classifier (see below), from which class membership of the withheld sample was 
predicted. The top 25 genes selected by signal-to-noise metric for each class are shown in 
Table 9. 

[00123] A weighted implementation of the k-NN algorithm was used that predicts the class 
of a new sample by calculating the Euclidean distance (d) of this sample to the k 
"nearest neighbor" samples in "expression" space in the training set; the 
predicted class was selected to be that of the majority of the k samples (Dasarathy, V. B. 
(1991) (IEEE Computer Society Press, Los Alamitos, Calif.)). A marker gene selection 
process was performed by feeding the k-NN algorithm only the features with higher 
correlation with the target class. In this version of the algorithm, each of the k 
neighbors was weighted according to 1/d. 
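
By way of non-limiting illustration, a distance-weighted k-NN prediction can be sketched as follows. Interpreting the "majority" as the class with the largest summed 1/d weight is an assumption, as are the function name and the handling of a zero distance.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k):
    """Predict a class label from the k nearest training samples.

    train: list of (expression_vector, label) pairs.
    Each of the k neighbors votes with weight 1/d, where d is its
    Euclidean distance to the query; an exact match (d == 0) gets
    infinite weight.
    """
    neighbors = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = defaultdict(float)
    for d, label in neighbors[:k]:
        votes[label] += 1.0 / d if d > 0 else float("inf")
    return max(votes, key=votes.get)
```

With the 1/d weighting, a single close neighbor can outvote two more distant neighbors of another class, which an unweighted majority vote would not allow.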

[00124] The cross-validation step was repeated for each sample and the errors were 
tallied. A random 8-class classifier would be expected to give an error rate of 100-(100/8), or 
87.5%. For the initial validation of clusters, classifiers were built with various numbers of 
marker genes selected from the 675-gene set that was used for hierarchical clustering. The 
best model used 100 genes (13% overall error); however, models using 75-200 genes 
performed with less than 20% overall error. 
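
By way of non-limiting illustration, the leave-one-out tally and the chance-level error rate can be sketched as follows. The one-dimensional nearest-neighbor predictor is a toy stand-in introduced only for the example; any classifier with the same call signature could be substituted.

```python
def leave_one_out_error(samples, labels, predict):
    """Percent of samples misclassified when each is withheld in turn."""
    errors = 0
    for i in range(len(samples)):
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        if predict(train, samples[i]) != labels[i]:
            errors += 1
    return 100.0 * errors / len(samples)


def chance_error_rate(n_classes):
    """Expected error (percent) of a random classifier over equally likely classes."""
    return 100.0 - 100.0 / n_classes


def nearest_neighbor_predict(train, query):
    """Toy 1-D predictor: label of the closest training value (illustration only)."""
    return min(train, key=lambda pair: abs(pair[0] - query))[1]
```

For example, chance_error_rate(8) reproduces the 87.5% figure given above.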

[00125] To test whether the cluster definitions were highly dependent on the 
675-gene set, classifiers were built from the remaining 11,925 genes. The genes were passed 
through a variation filter and marker genes were selected as above. A 100-gene model gave 
an overall error rate of 26%, with the classes that represent clusters performing better than the 
"other" class. 
Kaplan-Meier Analysis and Permutation Testing. 

[00126] Kaplan-Meier curves were generated using standard functions in the S-PLUS 
package (Venables, W. N. & Ripley, B. D. (199S) Modern Applied Statistics with S-PLUS 
(Springer, Berlin)). Only the 125 adenocarcinoma samples with survival information were 
used. For each cluster, survival within the cluster was compared to that of 
the out-of-cluster group using the two-sample comparison based on the corresponding two 
K-M curves. In this way, 5 K-M plots were obtained, one for each cluster, of which two plots 
have significant P-values for the comparison of the two curves, namely cluster 2 (C2, P = 0.00476) 
and cluster 4 (C4, P = 0.049). A similar analysis performed for stage I patient samples was 
statistically non-significant for all clusters. The small sample size (n=4) is a possible factor 
in the non-significance of the result for Stage I C2 patients. 
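
By way of non-limiting illustration, the Kaplan-Meier product-limit estimate underlying such curves can be sketched as follows. This is not the S-PLUS implementation; the function name and the input encoding (1 for an observed death, 0 for a censored observation) are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```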

[00127] These apparently significant P-values are biased because of multiple 
hypothesis testing. To test for this selection bias, the cluster labels were randomly permuted 
among the samples and, for each cluster, the within-cluster and out-of-cluster 
K-M curves and the corresponding P-values were re-computed. This randomization 
was repeated 1000 times. The 1000 sets of P-values were used to construct the null 
distribution for the test statistic T1 = the smallest P-value among the 5 clusters. From the 1000 
permutations, the P-value for T1 was 0.044. This P-value is a reasonable assessment of the 
significance of outcome differences for the cluster C2 (Fig. 1). This statistical evidence 
supports the predictive value of C2 on survival. 
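
By way of non-limiting illustration, the permutation test on the smallest P-value can be sketched as follows. The callable p_value_for_cluster is a hypothetical placeholder for whatever two-sample K-M comparison is used; it is not defined by the specification.

```python
import random


def min_p_permutation_test(labels, p_value_for_cluster, clusters, n_perm=1000, seed=0):
    """Permutation P-value for T1 = smallest per-cluster P-value.

    labels: cluster label of each sample.
    p_value_for_cluster(labels, c): P-value comparing cluster c with the
    remaining samples (hypothetical callable supplied by the user).
    Returns (observed T1, fraction of permutations with T1 <= observed).
    """
    rng = random.Random(seed)
    observed = min(p_value_for_cluster(labels, c) for c in clusters)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the cluster labels among the samples
        t1 = min(p_value_for_cluster(shuffled, c) for c in clusters)
        if t1 <= observed:
            hits += 1
    return observed, hits / n_perm
```

Taking the minimum over clusters inside every permutation is what accounts for having selected the most significant cluster after the fact.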

Example 3: Gene markers for different lung cancers and adenocarcinoma sub-classes 
[00128] Expression data were preprocessed by setting a minimal level of 10 units, and 
only genes that showed a 5-fold change across the data set were analyzed further. Genes 
correlated with a particular cluster label (e.g. "c0" or "colon") were identified by sorting all 
of the genes on the array according to the signal-to-noise statistic (mu_c0 - mu_others)/(sd_c0 + 
sd_others), where mu and sd represent the mean and standard deviation of expression, 
respectively, for each class. 
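
By way of non-limiting illustration, the preprocessing step can be sketched as follows. Reading "5-fold change across the data set" as a ratio of at least 5 between the largest and smallest floored values of a gene is an assumption, as is the function name.

```python
def preprocess(expression, floor=10.0, min_fold=5.0):
    """Floor expression values and keep genes with sufficient variation.

    expression: dict gene -> list of expression values, one per sample.
    Values below `floor` are raised to `floor`; a gene is kept when
    max / min of its floored values is at least `min_fold`
    (an assumed reading of "5-fold change across the data set").
    """
    kept = {}
    for gene, values in expression.items():
        floored = [max(v, floor) for v in values]
        if max(floored) / min(floored) >= min_fold:
            kept[gene] = floored
    return kept
```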

[00129] Permutation of the column (sample) labels was performed to compare these 
correlations to what would be expected by chance. The top signal-to-noise scores for the top 
marker genes were compared with the corresponding scores for randomly 
permuted versions of the cluster labels. 1000 random permutations were used to build 
histograms for the top marker, the second best, etc. Based on these histograms, the 0.1% 
significance levels were estimated and compared with the values obtained for the real data set. 
This test helps to assess the statistical significance of gene markers in terms of target class 
correlations. 
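
By way of non-limiting illustration, rank-wise significance thresholds of this kind can be sketched as follows. Taking the threshold at each rank as the k-th largest permuted score, with k equal to the significance level times the number of permutations, is an assumption; with 1000 permutations and a 0.1% level this is simply the largest permuted score at that rank.

```python
import random
from statistics import mean, stdev


def s2n(values, in_class):
    """Signal-to-noise score of one gene for a boolean class membership."""
    inside = [v for v, m in zip(values, in_class) if m]
    outside = [v for v, m in zip(values, in_class) if not m]
    return (mean(inside) - mean(outside)) / (stdev(inside) + stdev(outside))


def ranked_scores(expression, in_class):
    """Scores of all genes, best marker first."""
    return sorted((s2n(v, in_class) for v in expression.values()), reverse=True)


def rankwise_thresholds(expression, in_class, n_perm=1000, level=0.001, seed=0):
    """Null threshold for the top marker, the second best, and so on."""
    rng = random.Random(seed)
    shuffled = list(in_class)
    null_by_rank = [[] for _ in expression]
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the sample labels
        for rank, score in enumerate(ranked_scores(expression, shuffled)):
            null_by_rank[rank].append(score)
    k = max(1, round(level * n_perm))
    return [sorted(scores, reverse=True)[k - 1] for scores in null_by_rank]
```

A gene at a given rank in the real data would then be retained when its observed score exceeds the threshold for that rank.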

[00130] Included in the list of genes are those that exceed the 0.1% significance level 
for each cluster. For those clusters (colon, normal, C4) for which the lists are very long, only 
the top 200 genes are shown. The following Tables 1-8 present genes for the C1-C4 
subclasses, normal, colorectal metastases, C0, and other subclasses. (The s2n_obs is the 
observed signal-to-noise value; the non_norm_list is the Affymetrix reference identifier; the 
LL_num is the LocusLink identifier; and Desc is the description of the gene or gene product.) 
Table 1: C1 Markers 

[00131] According to the invention, preferred markers are markers 1-30, preferably 1- 
20, and more preferably 1-10. 
Class C1 

No. | s2n_obs | Perm 0.1% | non_norm_list | GB/TIGR Identifier | UNIGENE (as of summer 2001) | LL_num | Desc (unigene/locuslink or affy)
1 | 1.29 | 1.024 | 36457_at | U10860 | Hs.5398 | 8833 | guanine monphosphate synthetase
2 | 1.25 | 0.865 | 40117_at | D84557 | Hs.155462 | 4175 | minichromosome maintenance deficient (mis5, S. pombe) 6
3 | 1.22 | 0.797 | 37337_at | AI803447 | Hs.77496 | 6637 | small nuclear ribonucleoprotein polypeptide G
4 | 1.18 | 0.770 | 1055_g_at | M87339 | Hs.35120 | 5984 | replication factor C (activator 1) 4 (37kD)
5 | 1.18 | 0.767 | 41547_at | AF047472 | Hs.40323 | 9184 | BUB3 (budding uninhibited by benzimidazoles 3, yeast) homolog
6 | 1.17 | 0.763 | 38840_s_at | L10678 | Hs.91747 | 5217 | profilin 2
7 | 1.12 | 0.757 | 38065_at | X62534 | Hs.80684 | 3148 | high-mobility group (nonhistone chromosomal) protein 2
8 | 1.11 | 0.754 | 709_at | J00314 | Hs.336780 | 7280 | tubulin, beta polypeptide
9 | 1.1 | 0.739 | 41583_at | AC004770 | Hs.4756 | 2237 | flap structure-specific endonuclease 1
10 | l.Oo | 0.731 | 40195_at | X14850 | Hs.147097 | 3014 | H2A histone family, member X
11 | 1.05 | 0.728 | 39109_at | AB024704 | Hs.9329 | 22974 | chromosome 20 open reading frame 1
12 | 1.05 | 0.727 | 207_at | M86752 | Hs.75612 | 10963 | stress-induced-phosphoprotein 1 (Hsp70/Hsp90-organizing protein)
13 | 1.05 | 0.722 | 1884_s_at | M15796 | Hs.78996 | 5111 | proliferating cell nuclear antigen
14 | 1.04 | 0.716 | 34763_at | AF020043 | Hs.24485 | 9126 | chondroitin sulfate proteoglycan 6 (bamacan)
15 | 1.02 | 0.715 | 40619_at | M91670 | Hs.174070 | 27338 | ubiquitin carrier protein
16 | 1.01 | 0.715 | 1824_s_at | J05614 | | | proliferating cell nuclear antigen (PCNA)
17 | 1.01 | 0.714 | 572_at | M86699 | Hs.169840 | nil | TTK protein kinase
18 | 1 | 0.711 | 151_s_at | V00599 | Hs.179661 | 2280 | V00599 /FEATURE=mRNA /DEFINITION=HSTUB2 Human mRNA fragment encoding beta-tubulin. (from clone D-beta-1)
19 | 1 | 0.708 | 1803_at | X05360 | Hs.184572 | 983 | cell division cycle 2, G1 to S and G2 to M
20 | 0.99 | 0.706 | 1515_at | HG4074-HT4344 | | | Rad2
21 | 0.98 | 0.704 | 34791_at | X52882 | Hs.4112 | 6950 | t-complex 1
22 | 0.97 | 0.702 | 40690_at | X54942 | Hs.83758 | 1164 | CDC28 protein kinase 2
23 | 0.96 | 0.700 | 40697_at | X51688 | Hs.85137 | 890 | cyclin A2
24 | 0.96 | 0.696 | 37686_s_at | Y09008 | Hs.78853 | 7374 | uracil-DNA glycosylase
25 | 0.96 | 0.693 | 982_at | X74795 | Hs.77171 | 4174 | minichromosome maintenance deficient (S. cerevisiae) 5 (cell division cycle 46)
26 | 0.95 | 0.692 | 1505_at | D00596 | Hs.82962 | 7298 | thymidylate synthetase
27 | 0.94 | 0.690 | 38992_at | X64229 | Hs.110713 | 7913 | DEK oncogene (DNA binding)
28 | 0.94 | 0.690 | 33255_at | M97856 | Hs.243886 | 4678 | nuclear autoantigenic sperm protein (histone-binding)
29 | 0.94 | 0.688 | 36813_at | U96131 | Hs.6566 | 9319 | thyroid hormone receptor interactor 13
30 | 0.93 | 0.684 | 34882_at | Y12065 | Hs.296585 | 10528 | nucleolar protein (KKE/D repeat)
31 | 0.91 | 0.684 | 34715_at | U74612 | Hs.239 | 2305 | forkhead box M1
32 | 0.9 | 0.683 | 674_g_at | J04031 | Hs.172665 | 4522 | methylenetetrahydrofolate dehydrogenase (NADP+ dependent), methenyltetrahydrofolate cyclohydrolase, formyltetrahydrofolate synthetase
33 | 0.9 | 0.680 | 39337_at | M37583 | Hs.119192 | 3015 | H2A histone family, member Z
34 | 0.89 | 0.679 | 41756_at | AJ010842 | Hs.18259 | 11321 | XPA binding protein 1; putative ATP(GTP)-binding protein
35 | 0.89 | 0.678 | 40417_at | D43950 | | | chaperonin containing TCP1, subunit 5 (epsilon)
36 | 0.89 | 0.677 | 571_at | M86667 | Hs.179662 | 4673 | nucleosome assembly protein 1-like 1
37 | 0.89 | 0.676 | 38804_at | AF053641 | Hs.90073 | 1434 | chromosome segregation 1 (yeast homolog)-like
38 | 0.88 | 0.675 | 37304_at | U35451 | Hs.77254 | 10951 | chromobox homolog 1 (Drosophila HP1 beta)
39 | 0.88 | 0.674 | 34383_at | AB014458 | Hs.35086 | 7398 | ubiquitin specific protease 1
40 | 0.87 | 0.674 | 2003_s_at | U28946 | Hs.3248 | 2956 | mutS (E. coli) homolog 6
41 | U.o / | u.o/ J | 40407_at | U28386 | Hs.159557 | 3838 | karyopherin alpha 2 (RAG cohort 1, importin alpha 1)
42 | u.o / | U.O/Z | 40041_at | AF017790 | Hs.58169 | 10403 | highly expressed in cancer, rich in leucine heptad repeats
43 | U.oj | U.OOO | 41375_at | AJ245416 | Hs.103106 | 57819 | U6 snRNA-associated Sm-like protein
44 | 0.85 | 0.666 | 1985_s_at | X73066 | Hs.118638 | 4830 | non-metastatic cells 1, protein (NM23A) expressed in
45 | 0.85 | 0.664 | 36987_at | M94362 | Hs.334709 | 3999 | lamin B2
46 | 0.84 | 0.663 | 1782_s_at | M31303 | Hs.81915 | 3925 | leukemia-associated phosphoprotein p18 (stathmin)
47 | 0.84 | 0.659 | 35699_at | AF053306 | Hs.36708 | 701 | budding uninhibited by benzimidazoles 1 (yeast homolog), beta
48 | 0.84 | A /ZCO | 38414_at | U05340 | Hs.82906 | 991 | CDC20 (cell division cycle 20, S. cerevisiae, homolog)
49 | 0.84 | U.O J / | 35218_at | AF022385 | Hs.28866 | 11235 | programmed cell death 10
50 | 0.84 | 0.656 | 40726_at | U37426 | Hs.8878 | 3832 | kinesin-like 1
51 | 0.83 | 0.653 | 1136_at | L16991 | Hs.79006 | 1841 | deoxythymidylate kinase (thymidylate kinase)
52 | | U.O^Z | 36098_at | M72709 | as. lilsil | 04Z0 | splicing factor, arginine/serine-rich 1 (splicing factor 2, alternate splicing factor)
53 | 0.83 | 0.650 | 38350_f_at | AF005392 | Hs.98102 | 7278 | tubulin, alpha 2
54 | 0.83 | 0.649 | 39374_at | AL022325 | Hs.122552 | 51512 | hypothetical protein FLJ10140
55 | 0.83 | 0.649 | 34314_at | X59543 | Hs.2934 | 6240 | ribonucleotide reductase M1 polypeptide
56 | 0.83 | 0.648 | 38473_at | M63180 | Hs.84131 | 6897 | threonyl-tRNA synthetase
57 | 0.83 | 0.647 | 1945_at | M25753 | Hs.23960 | 891 | cyclin B1
58 | 0.83 | 0.646 | 37347_at | AA926959 | Hs.77550 | 84722 | hypothetical protein MGC1780
59 | 0.82 | 0.645 | 40587_s_at | AF054186 | Hs.298581 | 9521 | eukaryotic translation elongation factor 1 epsilon 1
60 | 0.82 | 0.645 | 41342_at | D38076 | Hs.24763 | 5902 | RAN binding protein 1
61 | 0.82 | 0.645 | 860_at | U03911 | Hs.78934 | 4436 | mutS (E. coli) homolog 2 (colon cancer, nonpolyposis type 1)
62 | 0.82 | 0.643 | 41569_at | AI680675 | Hs.44131 | 23234 | KIAA0974 protein
63 | 0.82 | 0.642 | 32610_at | X93510 | Hs.79691 | 8572 | LIM domain protein
64 | 0.81 | 0.639 | 33247_at | U86782 | Hs.178761 | 10213 | 26S proteasome-associated pad1 homolog
65 | 0.81 | 0.638 | 32530_at | X56468 | Hs.74405 | 10971 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, theta polypeptide
66 | 0.81 | 0.638 | 1854_at | X13293 | Hs.179718 | 4605 | v-myb avian myeloblastosis viral oncogene homolog-like 2
67 | 0.81 | 0.637 | 37333_at | X63692 | Hs.77462 | 1786 | DNA (cytosine-5-)-methyltransferase 1
68 | 0.8 | 0.637 | 318_at | D64142 | Hs.109804 | 8971 | H1 histone family, member X
69 | 0.8 | 0.636 | 418_at | X65550 | Hs.80976 | 4288 | antigen identified by monoclonal antibody Ki-67
70 | 0.8 | 0.635 | 38116_at | D14657 | Hs.81892 | 9768 | KIAA0101 gene product
71 | 0.8 | 0.634 | 40638_at | X70944 | Hs.180610 | 6421 | splicing factor proline/glutamine rich (polypyrimidine tract-binding protein-associated)
72 | 0.8 | 0.633 | 36913_at | U75679 | Hs.75257 | 7884 | Hairpin binding protein, histone
73 | 0.79 | 0.631 | 36171_at | AI521453 | Hs.74861 | 10923 | activated RNA polymerase II transcription cofactor 4
74 | 0.79 | 0.631 | 38251_at | AI127424 | Hs.90318 | 4632 | myosin, light polypeptide 1, alkali; skeletal, fast
75 | 0.79 | 0.631 | 32214_at | AF003938 | Hs.18792 | 9352 | thioredoxin-like, 32kD
76 | 0.79 | 0.630 | 35312_at | D21063 | Hs.57101 | 4171 | minichromosome maintenance deficient (S. cerevisiae) 2 (mitotin)
77 | 0.79 | 0.630 | 35995_at | AF067656 | Hs.42650 | 11130 | ZW10 interactor
78 | 0.79 | 0.626 | 39677_at | D80008 | Hs.36232 | 9837 | KIAA0186 gene product
79 | 0.78 | 0.624 | 38031_at | D21853 | Hs.79768 | 9775 | KIAA0111 gene product
80 | 0.78 | 0.624 | 34327_at | Z46606 | | | HLTF gene for helicase-like transcription factor /cds=UNKNOWN /gb=Z46606 /gi=575250 /ug=Hs.3068 /len=5439
81 | 0.78 | 0.623 | 41322_s_at | AI816034 | Hs.23990 | 55651 | nucleolar protein family A, member 2 (H/ACA small nucleolar RNPs)
82 | 0.78 | 0.622 | 36941_at | U16954 | Hs.75823 | 10962 | ALL1-fused gene from chromosome 1q
83 | 0.78 | 0.621 | 37228_at | U01038 | Hs.77597 | 5347 | polo (Drosophila)-like kinase
84 | 0.78 | 0.620 | 140_s_at | U68063 | Hs.30035 | 6434 | splicing factor, arginine/serine-rich (transformer 2 Drosophila homolog) 10
85 | 0.77 | 0.620 | 149_at | U90426 | Hs.179606 | 10212 | nuclear RNA helicase, DECD variant of DEAD box family
86 | 0.77 | 0.620 | 349_g_at | D14678 | Hs.20830 | 3833 | kinesin-like 2
87 | 0.77 | 0.619 | 1599_at | L25876 | Hs.84113 | 1033 | cyclin-dependent kinase inhibitor 3 (CDK2-associated dual specificity phosphatase)
88 | 0.77 | 0.619 | 39056_at | X53793 | Hs.117950 | 10606 | multifunctional polypeptide similar to SAICAR synthetase and AIR carboxylase
89 | 0.77 | 0.618 | 32594_at | AF026291 | Hs.79150 | 10575 | chaperonin containing TCP1, subunit 4 (delta)
90 | 0.77 | 0.618 | 37985_at | L37747 | | | lamin B1
91 | 0.77 | 0.618 | 584_s_at | M30938 | Hs.84981 | 7520 | X-ray repair complementing defective repair in Chinese hamster cells 5 (double-strand-break rejoining; Ku autoantigen, 80kD)
92 | 0.77 | 0.618 | 34659_at | AB018334 | Hs.23255 | 9631 | nucleoporin 155kD
93 | 0.77 | 0.616 | 39812_at | X79865 | Hs.109059 | 6182 | mitochondrial ribosomal protein L12
94 | 0.77 | 0.615 | 41403_at | AI032612 | Hs.105465 | 6636 | small nuclear ribonucleoprotein polypeptide F
95 | 0.76 | 0.615 | 33252_at | D38073 | Hs.179565 | 4172 | minichromosome maintenance deficient (S. cerevisiae) 3
96 | 0.76 | 0.614 | 37738_g_at | D25547 | Hs.79137 | 5110 | protein-L-isoaspartate (D-aspartate) O-methyltransferase
97 | 0.76 | 0.614 | 35916_s_at | AA877215 | | | cDNA, 3 end
98 | 0.75 | 0.613 | 32843_s_at | M30448 | | | casein kinase 2, beta polypeptide
99 | | | 1674_at | M15990 | Hs.194148 | 7525 | v-yes-1 Yamaguchi sarcoma viral oncogene homolog 1
100 | 0.74 | 0.611 | 40842_at | M60784 | | | small nuclear ribonucleoprotein polypeptide A
101 | 0.74 | U.olU | 38847_at | D79997 | Hs.184339 | 9833 | KIAA0175 gene product
102 | 0.74 | 0.609 | iy9oj_at | A1d7U57z | J1S.45UU2 | 35ol | ras-related C3 botulinum toxin substrate 3 (rho family, small GTP binding protein Rac3)
103 | A '7/1 | A /^AO | 351_i_at | D28423 | | | pre-mRNA splicing factor SRp20, 5 UTR
104 | 0.73 | U.OU/ | iolj5_at | UoooOz | J1S.744U7 | | nucleolar protein p40; homolog of yeast EBNA1-binding protein
105 | 0.73 | 0.607 | 39076_s_at | AI991040 | Hs.334879 | 10589 | DR1-associated protein 1 (negative cofactor 2 alpha)
106 | 0.73 | 0.606 | 34878_at | AB019987 | Hs.50758 | 10051 | SMC4 (structural maintenance of chromosomes 4, yeast)-like 1
107 | 0.73 | 0.604 | 41855_at | AF030424 | Hs.13340 | 8520 | histone acetyltransferase 1
108 | 0.73 | 0.604 | 38792_at | AD001528 | Hs.89718 | 6611 | spermine synthase
109 | U. /Z | u.ouz | 38123_at | D14878 | Hs.82043 | 8872 | D123 gene product
110 | 0.72 | 0.602 | 4U14j_at | | rlS.i!)o34o | | topoisomerase (DNA) II alpha (170kD)
111 | 0.72 | 0.601 | 39262_at | U79266 | Hs.23642 | 29901 | protein predicted by clone 23627
112 | 0.72 | 0.600 | 36107_at | AA845575 | Hs.73851 | 522 | ATP synthase, H+ transporting, mitochondrial F0 complex, subunit F6
113 | 0.72 | 0.599 | 37305_at | U61145 | Hs.77256 | 2146 | enhancer of zeste (Drosophila) homolog 2
114 | 0.72 | 0.599 | 34380_at | AC004472 | Hs.3439 | 30968 | stomatin-like 2
115 | 0.72 | 0.599 | 276_at | L08069 | Hs.94 | 3301 | heat shock protein, DNAJ-like 2
116 | 0.72 | 0.599 | 34795_at | U84573 | Hs.41270 | 5352 | procollagen-lysine, 2-oxoglutarate 5-dioxygenase (lysine hydroxylase) 2
117 | 0.71 | 0.599 | 39969_at | AA255502 | Hs.46423 | 8364 | H4 histone family, member G
118 | 0.71 | 0.599 | 32844_at | AF104913 | Hs.211568 | 1981 | eukaryotic translation initiation factor 4 gamma, 1
119 | 0.71 | 0.599 | 41407_at | L03411 | Hs.106061 | 7936 | RD RNA-binding protein
120 | 0.71 | 0.598 | 39759_at | AL031781 | Hs.15020 | 9444 | homolog of mouse quaking QKI (KH domain RNA binding protein)
121 | 0.71 | 0.598 | 35364_at | U50939 | Hs.61828 | 8883 | amyloid beta precursor protein-binding protein 1, 59kD
122 | 0.71 | 0.598 | 36812_at | U92715 | Hs.6564 | 8412 | breast cancer anti-estrogen resistance 3
123 | 0.71 | 0.598 | 36837_at | U63743 | Hs.69360 | 11004 | kinesin-like 6 (mitotic centromere-associated kinesin)
124 | 0.71 | 0.597 | 471_f_at | U47634 | Hs.159154 | 10381 | tubulin, beta, 4
125 | 0.71 | 0.597 | 40879_at | AB014599 | Hs.330988 | 23299 | KIAA0699 protein
126 | 0.71 | 0.596 | 947_at | D55716 | Hs.77152 | 4176 | minichromosome maintenance deficient (S. cerevisiae) 7






127 | 0.71 | 0.595 | 157_at | U65011 | Hs.30743 | 23532 | preferentially expressed antigen in melanoma
128 | 0.7 | 0.593 | 35200_at | X92518 | Hs.2726 | 8091 | high-mobility group (nonhistone chromosomal) protein isoform I-C
129 | 0.7 | 0.592 | 32194_at | M37197 | Hs.184760 | 10153 | CCAAT-box-binding transcription factor
130 | 0.7 | 0.592 | 39173_at | X56597 | Hs.99853 | 2091 | fibrillarin
131 | 0.7 | 0.590 | 1840_g_at | HG1112-HT1112 | | | Ras-Like Protein Tc4
132 | 0.7 | 0.588 | 37739_at | M86737 | Hs.79162 | 6749 | structure specific recognition protein 1
133 | 0.7 | 0.587 | 34510_at | AF070552 | Hs.122908 | 81620 | DNA replication factor
134 | 0.7 | 0.585 | 36536_at | AF070614 | Hs.61490 | 29970 | schwannomin interacting protein 1
135 | 0.7 | 0.583 | 36863_at | AF032862 | Hs.72550 | 3161 | hyaluronan-mediated motility receptor (RHAMM)
136 | 0.69 | 0.583 | 34790_at | S70154 | Hs.278544 | 39 | acetyl-Coenzyme A acetyltransferase 2 (acetoacetyl Coenzyme A thiolase)
137 | 0.69 | 0.583 | 527_at | U14518 | Hs.1594 | 1058 | centromere protein A (17kD)
138 | 0.69 | 0.581 | 38679_g_at | AA733050 | Hs.1066 | 6635 | small nuclear ribonucleoprotein polypeptide E
139 | 0.69 | 0.581 | 39984_j_at | U73704 | Hs.49105 | 11146 | FKBP-associated protein
140 | 0.68 | 0.581 | 40610_at | AI743507 | Hs.173518 | 51663 | likely ortholog of mouse zinc finger protein Zfr
141 | 0.68 | 0.581 | 39792_at | AF000364 | Hs.15265 | 10236 | heterogeneous nuclear ribonucleoprotein R
142 | 0.68 | 0.579 | 33266_at | AF015254 | Hs.180655 | 9212 | serine/threonine kinase 12
143 | 0.68 | 0.578 | 31858_at | X07315 | Hs.151734 | 10204 | nuclear transport factor 2 (placental protein 15)
144 | 0.68 | 0.578 | 32340_s_at | M85234 | Hs.74497 | 4904 | nuclease sensitive element binding protein 1
145 | 0.68 | 0.577 | 34099_f_at | W26056 | Hs.343569 | | cDNA
146 | 0.68 | 0.577 | 831_at | U28042 | Hs.41706 | 1662 | DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 10 (RNA helicase)
147 | 0.68 | 0.576 | 37945_at | U91316 | Hs.8679 | 11332 | cytosolic acyl coenzyme A thioester hydrolase
148 | 0.68 | 0.576 | 33035_at | AL021397 | Hs.137576 | 26514 | ribosomal protein L34 pseudogene 1
149 | 0.68 | 0.575 | 32120_at | AF063308 | Hs.16244 | 10615 | mitotic spindle coiled-coil related protein
150 | 0.68 | 0.575 | 36104_at | AA526497 | Hs.73818 | 7388 | ubiquinol-cytochrome c reductase hinge protein
151 | 0.67 | 0.575 | 32548_at | L24804 | Hs.278270 | 10728 | unactive progesterone receptor, 23 kD
152 | 0.67 | 0.574 | 36872_at | AL120559 | Hs.7351 | 10776 | cyclic AMP phosphoprotein, 19 kD
153 | 0.67 | 0.573 | 38634_at | M11433 | Hs.101850 | 5947 | retinol-binding protein 1, cellular
154 | 0.67 | 0.573 | 37683_at | D80012 | Hs.78829 | 9100 | ubiquitin specific protease 10
155 | 0.67 | 0.573 | 33127_at | U89942 | Hs.83354 | 4017 | lysyl oxidase-like 2
156 | 0.67 | 0.572 | 41401_at | U57646 | Hs.10526 | 1466 | cysteine and glycine-rich protein 2
157 | 0.67 | 0.572 | 40074_at | X16396 | Hs.154672 | 10797 | methylene tetrahydrofolate dehydrogenase (NAD+ dependent), methenyltetrahydrofolate cyclohydrolase
158 | 0.66 | 0.572 | 41600_at | U59435 | Hs.5181 | 5036 | proliferation-associated 2G4, 38kD
159 | 0.66 | 0.571 | 1449_at | D00763 | Hs.251531 | 5685 | proteasome (prosome, macropain) subunit, alpha type, 4
160 | 0.66 | 0.570 | 37046_at | AI246726 | Hs.76913 | 5686 | proteasome (prosome, macropain) subunit, alpha type, 5
161 | 0.66 | 0.570 | 34814_at | AL041443 | Hs.4311 | 10054 | SUMO-1 activating enzyme subunit 2
162 | 0.66 | 0.570 | 32615_at | J05032 | Hs.80758 | 1615 | aspartyl-tRNA synthetase
163 | 0.66 | 0.569 | 39086_g_at | AA768912 | Hs.923 | 6742 | single-stranded DNA-binding protein 1
164 | 0.65 | 0.569 | 39747_at | U52427 | Hs.14839 | 5436 | polymerase (RNA) II (DNA directed) polypeptide G
165 | 0.65 | 0.568 | 39009_at | N98670 | | | cDNA, 5 end
166 | 0.65 | 0.568 | 40124_at | Y18418 | Hs.272822 | 8607 | RuvB (E coli homolog)-like 1
167 | 0.65 | 0.568 | 32730_at | AL080059 | Hs.173094 | 85453 | Homo sapiens mRNA for KIAA1750 protein, partial cds
168 | 0.64 | 0.567 | 38662_at | AL047596 | Hs.306117 | 23152 | KIAA0306 protein
169 | 0.64 | 0.567 | 33679_f_at | X02344 | Hs.251653 | 10383 | tubulin, beta, 2
170 | 0.64 | 0.567 | 37302_at | U30872 | Hs.77204 | 1063 | centromere protein F (350/400kD, mitosin)
171 | 0.64 | 0.566 | 39704_s_at | L17131 | Hs.139800 | 3159 | high-mobility group (nonhistone chromosomal) protein isoforms I and Y
172 | 0.64 | 0.565 | 131_at | X83928 | Hs.83126 | 6882 | TATA box binding protein (TBP)-associated factor, RNA polymerase II, I, 28kD
173 | 0.64 | 0.565 | 40779_at | U59919 | Hs.171374 | 22920 | smgGDS-ASSOCIATED PROTEIN
174 | 0.64 | 0.564 | 38114_at | D38551 | Hs.81848 | 5885 | RAD21 (S. pombe) homolog
175 | 0.64 | 0.564 | 32850_at | Z25535 | Hs.211608 | 9972 | nucleoporin 153kD
176 | 0.64 | 0.564 | 1250_at | U47077 | Hs.155637 | 5591 | protein kinase, DNA-activated, catalytic polypeptide
177 | 0.64 | 0.564 | 37345_at | AF013759 | Hs.7753 | 813 | calumenin
178 | 0.64 | 0.563 | 37293_at | D43948 | Hs.76989 | 9793 | KIAA0097 gene product
179 | 0.64 | 0.563 | 40418_at | X74262 | Hs.16003 | 5928 | retinoblastoma-binding protein 4
180 | 0.64 | 0.562 | 38158_at | D79987 | Hs.153479 | 9700 | extra spindle poles, S. cerevisiae, homolog of
181 | 0.64 | 0.562 | 910_at | M15205 | Hs.105097 | 7083 | thymidine kinase 1, soluble
182 | 0.64 | 0.562 | 35314_at | D63880 | Hs.5719 | 9918 | chromosome condensation-related SMC-associated protein 1
183 | 0.64 | 0.561 | 41601_at | AA142964 | Hs.64311 | 6868 | a disintegrin and metalloproteinase domain 17 (tumor necrosis factor, alpha, converting enzyme)
184 | 0.63 | 0.561 | 41824_at | AI140114 | Hs.6153 | 51096 | CGI-48 protein
185 | 0.63 | 0.560 | 36184_at | L06419 | Hs.75093 | 5351 | procollagen-lysine, 2-oxoglutarate 5-dioxygenase (lysine hydroxylase, Ehlers-Danlos syndrome type VI)
186 | 0.63 | 0.560 | 41133_at | U32519 | Hs.220689 | 10146 | Ras-GTPase-activating protein SH3-domain-binding protein
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(luugene/locuslink 












summer 




or aflfy) 












O AAl \ 

2001) 






lo7 


0.63 


A CCA 

0.559 


i5o94_at 


ADAI AC01 

Ad0145o/ 


TT-, i/roo 


y44o 


miiogen-aCLivaxeu 
















protein kinase 
















Kmase lunase 
















kinase 4 




U.DO 




39070_at 


T TAI AC? 

U03057 


TT„ 1 1 0>1AA 

ilS.llo400 


O024 


singed 
















(urosopniia ^-luce 
















(^sea urcmn lascui 
















homolog like) 


1 on 


A /CO 


A CCA 

0.359 


1 0A1 

ioUi_at 


U/OOJo 


TT_ C/IAQA 


^QA 


oxCL^/Vi associaiea 
















KUNijr uomain i 


lyu 


0.63 


0.557 


Jo405_at 


TT'lCI ^C 


TT- 0071 O 


oUo / 


jxagiie A menial 
















retardation. 
















autosomal 
















homolog 1 


1 Q1 


A /CO 

U.oi 


A CC7 

0.55/ 


oooo4_ai 


A TA1 AO<;'5 


llS.lUO / /O 


L IKiD^ 


















transporting, type 
















z\^, memoer i 


192 


0.63 


0.554 


31b32_at 


AJt>00oo24 


TT_ 1 /|A1 O 

rlS. 14912 


0*3 'J A< 


iviiVA.uzoo proiem 


193 0.63 0.554 410_s_at X57152 Hs.165843 1460 casein kinase 2, beta polypeptide
194 0.62 0.554 39060_at D38048 Hs.118065 5695 proteasome (prosome, macropain) subunit, beta type,


195 


U.oz 




4041z_at 


AA2U34 /O 


rLS.232Do/ 


92^2 


7 

pituitary tumor-transforming 1


196 37729_at Y08614 Hs.79090 7514 exportin 1 (CRM1, yeast, homolog)
197 0.62 0.552 38863_at L07540 Hs.171075 5985 replication factor C (activator 1) 5
198 0.62 0.551 37726_at X06323 Hs.79086 11222 mitochondrial ribosomal protein L3
199 0.62 0.551 41003_at U41816 Hs.91161 5203 prefoldin 4
200 0.62 0.550 592_at M34079 Hs.250758 5702 proteasome (prosome, macropain) 26S subunit, ATPase, 3

Table 2: C2 Markers 

[00132] The C2 class is a robust class of markers. According to the invention, 
preferred markers are markers 1-30, preferably 1-20, and more preferably 1-10. Highly 
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Identifier 
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kallikrein 11 


1 


1 A/\ 


W. /Ol 




AB012917 Hs.57771 


11012 


Z 


1.27 


U. I D\j 


40544_g_at 






429 achaete-scute complex (Drosophila) homolog-like 1


3 


1.27 


0.721 


3oDUo_at 




tie l^'\ftCi 


1363 carboxypeptidase E


4 1.21 0.715 31477_at L08044 Hs.82961 7033 trefoil factor 3 (intestinal)


5 


lie 


U. /Uo 




X02330 calcitonin/calcitonin-related polypeptide, alpha


u 


1 17 
1.1/ 




/lA/^AO of 


X64810 Hs.78977 5122 proprotein convertase subtilisin/kexin type 1


7 


1.16 




>|ylO of 


X15187 Hs.82689 7184 tumor rejection antigen (gp96) 1


8 


1.05 


V/.OUv 


1/i^AA of 


X15943 Hs.37058 796 calcitonin/calcitonin-related polypeptide, alpha


9 


1.02 




aA'SIO 0+ 

3933Z_at 


AF035316 Hs.336780 7280 tubulin, beta polypeptide


10 0.97 0.651 39756_g_at Z93930 Hs.149923 7494 X-box binding protein 1
11 0.96 0.647 39135_at AB018310 Hs.95180 23151 KIAA0767 protein
12 0.95 0.645 34785_at AB028948 Hs.4084 23389 KIAA1025 protein
13 0.92 0.644 37617_at U90912 Hs.81897 54462 KIAA1128 protein
14 0.85 0.630 1788_s_at U48807 Hs.2359 1846 dual specificity phosphatase 4
15 0.85 0.630 37928_at AA621555 Hs.84928 4801 nuclear transcription factor Y, beta
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16 



s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

16 0.84 0.625 37141_at U39840 Hs.299867 3169 hepatocyte nuclear factor 3, alpha
17 0.84 0.623 35995_at AF067656 Hs.42650 11130 ZW10 interactor
18 0.83 0.622 40201_at M76180 Hs.150403 1644 dopa decarboxylase (aromatic L-amino acid decarboxylase)
19 0.82 0.620 35800_at D63391 Hs.6793 5050 platelet-activating factor acetylhydrolase, isoform Ib, gamma subunit (29kD)
20 0.8 0.618 33543_s_at U77718 Hs.44499 5411 pinin, desmosome associated protein
21 0.8 0.615 1822_at HG4677-HT5102 Oncogene Ret/Ptc2, Fusion Activated
22 0.79 0.613 35343_at M37400 Hs.597 2805 glutamic-oxaloacetic transaminase 1, soluble (aspartate aminotransferase 1)
23 0.78 0.610 41403_at AI032612 Hs.105465 6636 small nuclear ribonucleoprotein polypeptide F
24 0.78 0.606 37426_at U80736 Hs.110826 27324 trinucleotide repeat containing 9
25 0.77 0.605 39113_at AI262789 Hs.93659 9601 protein disulfide isomerase related protein (calcium-binding protein, intestinal-related)
26 0.77 0.604 40881_at X64330 Hs.174140 47 ATP citrate lyase
27 0.77 0.603 32137_at AF029778 Hs.166154 3714 jagged 2
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28 



s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

28 0.77 0.600 34690_at U66616 Hs.236030 6601 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 2
29 0.77 0.599 41395_at AB003791 Hs.104576 8534 carbohydrate (keratan sulfate Gal-6) sulfotransferase 1
30 0.76 0.599 39891_at AI246730 Hs.126901 cDNA, 3 end
31 0.76 0.598 41250_at U24169 Hs.301613 7965 JTV1 gene
32 0.76 0.598 37545_at W22110 Hs.7934 9314 Kruppel-like factor 4 (gut)
33 0.75 0.597 41146_at J03473 Hs.177766 142 ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)
34 0.74 0.597 40865_at U51166 Hs.173824 6996 thymine-DNA glycosylase
35 0.74 0.597 35147_at AB002360 Hs.25515 23263 MCF.2 cell line derived transforming sequence-like
36 0.74 0.591 36847_r_at AA121509 Hs.70830 51690 U6 snRNA-associated Sm-like protein LSm7
37 0.73 0.588 37293_at D43948 Hs.76989 9793 KIAA0097 gene product
38 0.73 0.587 36482_s_at Y15724 Hs.5541 489 ATPase, Ca++ transporting, ubiquitous
39 0.72 0.586 38654_at X65488 Hs.103804 3192 heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A)
40 0.72 0.583 37359_at D14658 Hs.77665 9789 KIAA0102 gene product
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41 

s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

41 0.72 0.582 37638_at D50857 Hs.82295 1793 dedicator of cyto-kinesis 1
42 0.72 0.582 39824_at AI391564 Hs.110820 cDNA, 3 end
43 0.71 0.580 37019_at J00129 Hs.7645 2244 fibrinogen, B beta polypeptide
44 0.71 0.578 40074_at X16396 Hs.154672 10797 methylene tetrahydrofolate dehydrogenase (NAD+ dependent), methenyltetrahydrofolate cyclohydrolase
45 0.71 0.576 40584_at Y08612 Hs.172108 4927 nucleoporin 88kD
46 0.7 0.576 33266_at AF015254 Hs.180655 9212 serine/threonine kinase 12
47 0.69 0.575 36008_at AF041434 Hs.43666 11156 protein tyrosine phosphatase type IVA, member 3
48 0.69 0.574 37333_at X63692 Hs.77462 1786 DNA (cytosine-5-)-methyltransferase 1
49 0.69 0.574 1660_at D83004 Hs.75355 7334 ubiquitin-conjugating enzyme E2N (homologous to yeast UBC13)
50 0.69 0.573 36149_at D78014 Hs.74566 1809 dihydropyrimidinase-like 3
51 0.68 0.573 39692_at AL080209 Hs.13659 64764 hypothetical protein DKFZp586F2423
52 0.68 0.570 40317_at U57352 Hs.6517 40 amiloride-sensitive cation channel 1, neuronal (degenerin)
53 0.67 0.568 31906_at AF068754 Hs.250899 3281 heat shock factor binding protein 1
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54 0.67 



s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

54 0.67 0.567 149_at U90426 Hs.179606 10212 nuclear RNA helicase, DECD variant of DEAD box family
55 0.67 0.567 38978_at AF013758 Hs.109643 10605 polyadenylate binding protein-interacting protein 1
56 0.67 0.565 35566_f_at AF015128 Hs.301365 IgG heavy chain variable region (Vh26)
57 0.66 0.564 36745_at AF035308 Hs.167036 clone 23798 and 23825
58 0.66 0.563 36133_at AL031058 Hs.74316 1832 desmoplakin (DPI, DPII)
59 0.66 0.563 35966_at X71125 Hs.79033 25797 glutaminyl-peptide cyclotransferase (glutaminyl cyclase)
60 0.66 0.562 37955_at AB015631 Hs.8752 10330 transmembrane protein 4
61 0.65 0.562 40846_g_at U10324 Hs.256583 3609 interleukin enhancer binding factor 3, 90kD
62 0.65 0.560 37101_at AL050008 Hs.306186 25855 DKFZP564A063 protein
63 0.65 0.559 40580_r_at M24398 Hs.171814 5763 parathymosin
64 0.65 0.559 36489_at D00860 Hs.56 5631 phosphoribosyl pyrophosphate synthetase 1
65 0.65 0.558 37133_at AF027406 Hs.104865 26576 serine/threonine kinase 23
66 0.64 0.557 33714_at Y10043 Hs.19114 3149 high-mobility group (nonhistone chromosomal) protein 4
67 0.64 0.557 35351_at U89505 Hs.6106 5936 RNA binding motif protein 4
68 0.64 0.557 41829_at AB018274 Hs.6214 23367 KIAA0731 protein
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69 



s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

69 0.64 0.555 39158_at AB021663 Hs.9754 22809 activating transcription factor 5
70 0.64 0.555 35163_at AB028964 Hs.26023 22887 KIAA1041 protein
71 0.64 0.555 36406_at AA401397 Hs.165296 26085 kallikrein 13
72 0.63 0.554 32149_at AA532495 Hs.183752 4477 microseminoprotein, beta-
73 0.63 0.554 32825_at Y10805 Hs.20521 3276 HMT1 (hnRNP methyltransferase, S. cerevisiae)-like 2
74 0.63 0.553 35590_s_at gastric inhibitory polypeptide receptor
75 0.63 0.553 36636_at 4942 ornithine aminotransferase (gyrate atrophy)
76 0.63 0.553 37944_at 2643 GTP cyclohydrolase 1 (dopa-responsive dystonia)
77 0.63 0.552 41083_at AC006276 Hs.99093 chromosome 19, cosmid R28379
78 0.62 0.550 39317_at D86324 Hs.24697 8418 cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMP-N-acetylneuraminate monooxygenase)
79 0.62 0.550 33162_at X02160 Hs.89695 3643 insulin receptor
80 0.62 0.549 31586Xat X72475 Hs.156110 3514 immunoglobulin kappa constant
81 0.62 0.549 34289Xat D50920 Hs.23106 9862 KIAA0130 gene product
82 0.62 0.549 36615_at M83751 Hs.75412 7873 Arginine-rich protein
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83 



s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

83 0.62 0.546 904_s_at L47276 (cell line HL-60) alpha topoisomerase truncated-form mRNA, 3'UTR
84 0.62 0.545 39791_at M23114 Hs.1526 488 ATPase, Ca++ transporting, cardiac muscle, slow twitch 2
85 0.62 0.544 36203_at X16277 Hs.75212 4953 ornithine decarboxylase 1
86 0.61 0.544 1582_at M29540 Hs.220529 1048 carcinoembryonic antigen-related cell adhesion molecule 5
87 0.61 0.544 38456_s_at AL049650 Hs.83753 6628 small nuclear ribonucleoprotein polypeptides B and B1
88 0.61 0.544 39610_at X16665 Hs.2733 3212 homeo box B2
89 0.61 0.544 37272_at X57206 Hs.78877 3707 inositol 1,4,5-trisphosphate 3-kinase B
90 0.61 0.544 36185_at D32050 Hs.75102 16 alanyl-tRNA synthetase
91 0.61 0.544 38435_at U25182 Hs.83383 10549 thioredoxin peroxidase (antioxidant enzyme)
92 0.6 0.544 32447_at U76388 Hs.157037 2516 nuclear receptor subfamily 5, group A, member 1
93 0.6 0.544 38753_at AF039022 Hs.85951 11260 exportin, tRNA (nuclear export receptor for tRNAs)
94 0.6 0.543 38248_at AB011124 Hs.90232 9762 KIAA0552 gene product
95 0.6 0.543 38719_at U03985 Hs.108802 4905 N-ethylmaleimide-sensitive factor
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96 



s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

96 0.6 0.543 34105_f_at AI147237 Hs.300697 3502 immunoglobulin heavy constant gamma 3 (G3m marker)
97 0.6 0.543 40840_at M80254 Hs.173125 10105 peptidylprolyl isomerase F (cyclophilin F)
98 0.6 0.542 1745_at HG4679-HT5104 Oncogene Ret/Ptc, Fusion Activated
99 0.59 0.542 1884_s_at M15796 Hs.78996 5111 proliferating cell nuclear antigen
100 0.59 0.542 31935_s_at U75968 Hs.27424 1663 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 11 (S. cerevisiae CHL1-like helicase)
101 0.59 0.542 34933_at AJ238381 Hs.132576 5083 paired box gene 9
102 0.59 0.542 33304_at U88964 Hs.183487 3669 interferon stimulated gene (20kD)
103 0.59 0.542 38340_at AB014555 Hs.96731 9026 huntingtin interacting protein-1-related
104 0.58 0.542 1796_s_at U05681 B-cell CLL/lymphoma 3
105 0.58 0.542 34726_at U07139 Hs.250712 784 calcium channel, voltage-dependent, beta 3 subunit
106 0.58 0.541 35253_at AB011143 Hs.30687 9846 GRB2-associated binding protein 2
107 0.58 0.541 35151_at AF089814 Hs.25664 10263 tumor suppressor deleted in oral cancer-related 1
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aon nonn list GB/TIGR 


s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

108 0.58 0.541 38635_at Z69043 Hs.102135 6748 signal sequence receptor, delta (translocon-associated protein delta)
109 0.58 0.541 39040_at W28360 Hs.184325 51632 CGI-76 protein
110 0.57 0.541 38860_at U66346 Hs.189 5143 phosphodiesterase 4C, cAMP-specific (dunce (Drosophila)-homolog phosphodiesterase E1)
111 0.57 0.541 1432_s_at D16105 Hs.210 4058 leukocyte tyrosine kinase
112 0.57 36851_A_at U42360 Putative prostate cancer tumor suppressor
113 0.57 0.540 37985_at L37747 5901 lamin B1
114 0.57 0.540 38708_at AF054183 Hs.10842 RAN, member RAS oncogene family
115 0.57 0.540 jZ4U4_ax AF065314 Hs.234785 1261 cyclic nucleotide gated channel alpha 3
116 0.57 0.540 36970_at D80004 Hs.75909 23199 KIAA0182 protein
117 0.57 0.540 32646_at AB007918 Hs.169182 23046 KIAA0449 protein
118 0.57 0.539 32485_at X00371 Hs.118836 4151 myoglobin
119 0.57 0.538 37774_at AI819942 Hs.90998 23157 septin 2
120 0.57 0.538 36153_at L13848 Hs.74578 1660 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 9 (RNA helicase A, nuclear DNA helicase II; leukophysin)
121 0.57 0.538 288_s_at L25931 Hs.152931 3930 lamin B receptor
122 0.56 0.538 33347_at AA883868 Hs.216354 6048 ring finger protein 5
123 0.56 0.538 33399_at AA142942 Hs.241507 6194 ribosomal protein S6
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s2n_obs Penn non_nonn_list GB/TIGR UNIGENE 
s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

124 0.56 0.538 1888_s_at X06182 Hs.81665 3815 v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
125 0.56 0.538 1846_at L78132 Hs.4082 3964 prostate carcinoma tumor antigen (pcta-1)/lectin
126 0.56 0.537 34338_at D49738 Hs.31053 1155 cytoskeleton-associated protein 1
127 0.56 0.537 41241_at D84273 Hs.181311 4677 asparaginyl-tRNA synthetase
128 0.56 0.536 35670_at M37457 ATPase, Na+/K+ transporting, alpha 3 polypeptide
129 0.56 0.536 41399_at AB029034 Hs.285641 23133 KIAA1111 protein
130 0.55 0.536 36676_at AL031659 Hs.75722 6185 growth hormone releasing hormone
131 0.55 0.536 39927_at U17032 Hs.267831 394 Rho GTPase activating protein 5
132 0.55 0.536 1257_s_at L42379 Hs.77266 5768 quiescin Q6
133 0.55 0.535 37576_at U52969 Hs.80296 5121 Purkinje cell protein 4
134 0.55 0.535 34987_s_at X79536 Hs.249495 3178 heterogeneous nuclear ribonucleoprotein A1
135 0.55 0.535 1798_at U41060 Hs.79136 25800 LIV-1 protein, estrogen regulated
136 0.55 0.535 40674_s_at S82986 Hs.820 3223 homeo box C6
137 0.55 0.535 39342_at X94754 Hs.279946 4141 methionine-tRNA synthetase
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s2n_obs Perm non_nonnJist GB/TIGR UNIGENE 
s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

138 0.55 0.535 38707_r_at S75174 Hs.108371 1874 E2F transcription factor 4, p107/p130-binding
139 0.55 0.535 34648_at Z12830 Hs.250773 6745 signal sequence receptor, alpha (translocon-associated protein alpha)
140 0.54 0.535 40653_at U32439 Hs.79348 6000 regulator of G-protein signalling 7
141 0.54 0.534 34827_at AF045458 Hs.47061 8408 unc-51 (C. elegans)-like kinase 1
142 0.54 0.534 36178_at U23143 Hs.75069 6472 serine hydroxymethyltransferase 2 (mitochondrial)
143 0.54 0.534 34264_at AB026894 Hs.226499 23623 nesca protein
144 0.54 0.534 41750_at D49489 Hs.182429 10130 protein disulfide isomerase-related protein
145 0.54 0.534 36971_at D87446 Hs.75912 23505 KIAA0257 protein
146 0.54 0.534 38399_at AL034428 Hs.82575 6629 small nuclear ribonucleoprotein polypeptide B"
147 0.54 0.534 32190_at AL050118 Hs.184641 9415 fatty acid desaturase 2
148 0.54 0.534 38835_at U94831 Hs.91586 10548 transmembrane 9 superfamily member 1
149 0.54 0.533 37316_r_at AI057607 Hs.7731 55837 uncharacterized bone marrow protein BM036



Table 3: C3 Markers 



[00133] According to the invention, preferred markers are markers 1-30, preferably
1-20, and more preferably 1-10.
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Class C3 





s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

1 1.42 0.866 37669_s_at U16799 Hs.78629 481 ATPase, Na+/K+ transporting, beta 1 polypeptide
2 1.2 0.724 ioUoo at Hs.4984 23382 KIAA0828 protein
3 1.17 0.707 33699_at M18667 progastricsin (pepsinogen C)
4 1.06 0.706 1081_at M33764 Hs.75212 4953 ornithine decarboxylase 1
5 1.06 0.688 33396_at U12472 Hs.226795 2950 glutathione S-transferase pi
6 1.06 0.679 34319_at AA131149 Hs.2962 6286 S100 calcium-binding protein P
7 1.02 0.674 40409_at U46689 Hs.159608 224 aldehyde dehydrogenase 10 (fatty aldehyde dehydrogenase)
8 1.02 0.673 32805_at U05861 aldo-keto reductase family 1, member C1 (dihydrodiol dehydrogenase 1; 20-alpha (3-alpha)-hydroxysteroid dehydrogenase)
9 0.99 0.667 33383_f_at AI820718 Hs.250505 5914 retinoic acid receptor, alpha
10 0.98 0.663 35207_at X76180 Hs.2794 6337 sodium channel, nonvoltage-gated 1 alpha
11 0.98 0.655 33052_at U95301 Hs.144442 8399 phospholipase A2, group X
12 0.98 0.649 38526_at U02882 Hs.172081 5144 phosphodiesterase 4D, cAMP-specific (dunce (Drosophila)-homolog phosphodiesterase E3)
13 0.97 0.646 38066_at M81600 diaphorase (NADH/NADPH) (cytochrome b-5 reductase)


14 


0.93 


0.644 


1882_g_at 


HG4058- 
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phosphodiesterase 
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Hs.88778 
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carbonyl reductase 
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HT26388 






Epithelial, Alt. 
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J02761. 


Hs.76305 


6439 


surfactant, 








pulmonary- 








associated protein B 


Z49835 


Hs.289101 


2923 


glucose regulated 








protein, 58kD 


U10868 


Hs.83155 
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aldehyde 








dehydrogenase 7 


M72393 Hs.211587 5321 phospholipase A2, group IVA (cytosolic, calcium-dependent)


AB028972 


Hs.227835 
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K[AA1049protem 


AB029027 


Hs.279039 
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KIAAl 104 protein 


J05581 
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mucin 1, 








transmembrane 


D15050 


Hs.232068 


6935 


transcnption factor 
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interleukin 2 








expression) 


U29344 


Hs.83190 
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fatty acid synthase 


N74607 


Hs.234642 
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aquaporin 3 


M58286 


Hs.159 


7132 


tumor necrosis 








factor receptor 








superfamily, 
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U59185 


Hs.23590 
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solute carrier family 








16 (monocarboxylic 








acid transporters), 








membCT4 
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Hs.9006 
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associated 








membrane protein)- 








associated protein A 
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ankyrin repeat- 








containing protein 
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enaopiasmic 
















rencuium lumenai 


















48 


0.7 


0.608 


1662_r_at 


HG2261- 
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oxygen regulated 








protein (150kD) 
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epoxide hydrolase 








1 microsomal 








^xenobiotic^ 
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Hs.28309 


7358 


UDP-glucose 








dehydrogenase 


UiVDDxJ 


Wq 70099 
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GTP-bindine 
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nvprexnref>55ed in 








skeletal muscle 
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protease, serine 1) 
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retinoic acid 








inHiiPf^ 3 


A0Z0D4 
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/ J J J 
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E2 variant 1 
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X1.0*-i/«/77 


1955 


EGF-like-domain, 








multinle 5 
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potassium channel, 








siibfamilv K. 








member 1 (TWIK- 








w 




Wq <^9 1 1 ^ 


23221 


KIAA0717 protein 




W<5 S7S 


218 


aldehyde 








dehvdroeenase 3 


JVLZ 1 oOo 


Wc 1 1 R940 


10564 brefeldin A-inhibited guanine nucleotide-exchange protein 2


M84424 






cathepsin E 
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37 


acyl-Coenzyme A 








dehydrogenase. 
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Hs.75564 
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interacting, 
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kinase 21) 
procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), beta polypeptide (protein disulfide isomerase; thyroid hormone binding protein p55)
nicastrin
activated leucocyte cell adhesion molecule
ribosomal protein S6 kinase, 90kD, polypeptide 2
ras homolog gene family, member B
transforming, acidic coiled-coil containing protein 2
clone DKFZp586C1019
destrin (actin depolymerizing factor)
hypothetical protein SMAP31
similar to vaccinia virus HindIII K4L ORF
aldehyde dehydrogenase 2, mitochondrial
clone RP11-127K18
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Table 4: C4 Markers 

[00134] According to the invention, preferred markers are markers 1-30, preferably 1- 
20, and more preferably 1-10. Highly preferred markers are cathepsin H, folate receptor 1 
(adult), BENE protein, and cytochrome b-5. 
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n CO 
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methvltrausferase 
















reductase 
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invariant gamma- 
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0.87 


0.635 


1 <00 o of 


JlvJJlo /- 
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HT3366 






Phosphatase 1, Non-Receptor, Alt. Splice 3


11 0.87 0.632 37512_at U89281 Hs.11958 8630 oxidative 3 alpha hydroxysteroid dehydrogenase; retinol dehydrogenase; 3-hydroxysteroid epimerase
12 0.86 0.631 38459_g_at L39945 cytochrome b-5
13 0.86 0.631 36965_at U13616 Hs.75893 288 ankyrin 3, node of Ranvier (ankyrin G)
14 0.85 0.630 593_s_at M34353 Hs.1041 6098 v-ros avian UR2 sarcoma virus oncogene homolog 1
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transmembrane 


22 


0.8 
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Hs.2 11595 
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phosphatase, non- 
















receptor type 13 
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associated 
















phosphatase) 


23 


0.8 


0.595 
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Hs.1790 
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nuclear receptor 
















subfamily 3, group C, 
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34823_at 


X60708 
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deaminase 
















complexing protein 2) 
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similar to rat integral 
















membrane 
















glycoprotein 
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30 


0.77 


0.578 


38984_at 
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Hs.110 
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putative L-type 
















neutral amino acid 



transporter 
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35 


0.76 


0.571 


34996_at 


U75329 


Hs.318545 


7113 


transmembrane 
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38 
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39 
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whey-acidic protein 
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carcinoma marker 
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epoxiue nyaroiase i, 

microsomal 
^AcnuuiO lie ) 


67 0.68 0.540 36508_at AF030186 Hs.58367 2239 glypican 4
68 0.68 0.540 33942_s_at AF004563 Hs.239356 6812 syntaxin binding protein 1
69 0.67 0.540 37629_at M55268 Hs.82201 1459 casein kinase 2, alpha prime polypeptide
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(as of 




(imigeneAociisliiik or 
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afEy) 












2001) 






70 0.67 0.539 32822_at J02966 Hs.2043 291 solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 4
71 0.67 0.538 35472_at Y10745 Hs.17287 3772 potassium inwardly-rectifying channel, subfamily J, member 15
72 0.67 0.537 34163_g_at D84111 Hs.80248 11030 RNA-binding protein gene with multiple splicing
73 0.67 0.536 31925_s_at L26584 Hs.169350 5923 Ras protein-specific guanine nucleotide-releasing factor 1
74 0.67 0.536 32854_at AB014596 Hs.21229 23291 f-box and WD-40 domain protein 1B
75 0.67 0.535 35645_at AL050148 Hs.31834 clone DKFZp586G1520
76 0.66 0.535 1986_at X74594 Hs.79362 5934 retinoblastoma-like 2 (p130)
77 0.66 0.533 1938_at K03218 v-src avian sarcoma (Schmidt-Ruppin A-2) viral oncogene homolog
78 0.66 0.532 1616_at D14838 Hs.111 2254 fibroblast growth factor 9 (glia-activating factor)
79 0.66 0.532 41440_at D82061 Hs.288354 7923 FabG (beta-ketoacyl-[acyl-carrier-protein] reductase, E coli) like
80 0.66 0.530 41129_at D26067 Hs.174905 23027 KIAA0033 protein
81 0.66 0.530 40209_at U72671 Hs.151250 7087 intercellular adhesion molecule 5, telencephalin
82 0.65 0.529 32676_at M93405 Hs.293970 4329 methylmalonate-semialdehyde dehydrogenase
83 0.65 0.528 36557_at M92303 Hs.635 782 calcium channel, voltage-dependent, beta 1 subunit
84 0.65 0.528 35228_at Y08682 Hs.29331 1375 carnitine palmitoyltransferase I, muscle
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0.1% 


St 


Identifier 


85 






1667_s_at 


J02871 


86 


0.65 


0.526 


40701_at 


U75362 


87 


U.OD 




40343_at 


AJ005814 


OO 


0.65 


0.524 


jyjui at 


AojUiU 


89 


0.65 


0.524 


35435_s_at 


AF001903 


90 


U.O*f 


n KOI 


34235_at 


AB018301 


01 


0.64 


0.523 


J / j44_at 


A0Z744 


92 


U.O*f 




41120_at 


D14686 


93 0.64 0.522 40673_at U12778
94 0.63 0.521 34353_at AB014548
95 35285_at AF007216
96 0.63 0.520 40822_at L41067
97 0.63 0.519 41331_at R53981


OS 
yo 


0.63 


0.519 


ylAT7Q ^4- 

hUZ /a_St 




99 0.63 0.519 36828_at AB002324
100 0.63 0.519 40128_at D79993
101 0.63 0.519 35382_at AF043244



UJNlLrlilNXi 


LLjmm 


Desc 


^aS> UX 




^umgcne/iocusuiiK or 


bUXULLUCl 




any; 










uou 


cyiocnroinc jLtJi/, 






suDiamily IV Id, 






poiypcpnue i 




oy /J 


UDiq^uiuii spcciuc 






proicasc 1^ 






(isopeptidasc T-3) 


TT_ '7AQ<>1 

lis. /uy 04 


i2U4 


homeo box A7 


ilS.4U:5UU 




calpain 3, (py4j 


We ft 1 1 n 




L-3 -hydroxyacyl- 






^oenzyiiie /v 






aeiiyurogciiase, scion 






chain 




9^^989 
ZOZoZ 


]SJJ\J\\J / Do pTOieiD 


TTq 77S99 




major histocompatibility complex, class II, DM alpha
aminomethyltransferase (glycine cleavage system protein T)


JlS.Oll/OH' 




acyl-Coenzyme A dehydrogenase, short/branched chain


TT-, o 1 no 1 


z3z44 


KIAA0o48 protem 


JtlS.D40Z 


OO/l 


solute carrier family 4, sodium bicarbonate cotransporter, member 4


rlS. 1 /ZO /4 


4/ /J 


nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent


£is.z4z /y 


yoou 


j?sJAAUoUo gene 






product 


riS.15DMo 


Ol AiCO 

23062 


KIAA1080 protein; Golgi-associated, gamma-adaptin ear containing, ARF-binding protein 2


ttq 004 






Hs.132853 


9685 


KIAA0171 gene 






product 


Hs.278439 


8996 


nucleolar protein 3 






(apoptosis repressor 






with CARD domain) 
102  0.63  0.518  40217_s_at  U65887  Hs.152981  1040  CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1
103  0.63  0.518  38095_i_at  M83664  Hs.814  3115  major histocompatibility complex, class II, DP beta 1
104  0.62  0.518  34555_at  X63755  Hs.2743  3846  keratin, cuticle, ultrahigh sulphur 1
105  0.62  0.517  33263_at  X67098  -  -  rTS beta protein
106  0.62  0.517  33267_at  AF035315  Hs.180737  -  clone 23664 and 23905
107  0.62  0.517  1594_at  J05448  Hs.79402  5432  polymerase (RNA) II (DNA directed) polypeptide C (33kD)
108  0.62  0.516  40013_at  Y12696  Hs.54570  1193  chloride intracellular channel 2
109  0.62  0.516  32122_at  L31573  Hs.16340  6821  sulfite oxidase
110  0.62  0.515  34800_at  AL039458  Hs.4193  26018  ortholog of mouse integral membrane glycoprotein LIG-1
111  0.62  0.515  41723_s_at  M32578  Hs.180255  3123  major histocompatibility complex, class II, DR beta 1
112  0.62  0.515  38683_s_at  AB029008  Hs.301226  57450  KIAA1085 protein
113  0.62  0.514  32235_at  AB011116  Hs.284251  23295  KIAA0544 protein
114  0.62  0.514  41689_at  R16035  Hs.12701  51090  plasmolipin
115  0.62  0.514  38318_at  AL050128  Hs.95260  51439  Autosomal Highly Conserved Protein
116  0.61  0.513  1619_g_at  D21241  -  -  cytochrome P-450 aromatase
117  0.61  0.513  39266_at  AF070632  Hs.23729  -  clone 24405
118  0.61  0.513  40711_at  AL049340  Hs.86405  -  clone DKFZp564P056
119  0.61  0.512  39247_at  U66689  Hs.274260  368  ATP-binding cassette, sub-family C (CFTR/MRP), member 6
120  0.61  0.512  39820_at  AF001549  Hs.110103  54700  RNA polymerase I transcription factor RRN3
121  0.61  0.511  39974_at  AF039917  Hs.47042  956  ectonucleoside triphosphate diphosphohydrolase 3
122  0.61  0.511  37704_at  Z14093  Hs.78950  593  branched chain keto acid dehydrogenase E1, alpha polypeptide (maple syrup urine disease)
123  0.61  0.510  [illegible]  [illegible]  [illegible]  [illegible]  mitogen-activated protein kinase kinase kinase 13
124  0.6  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  hypothetical protein [illegible]
125  0.6  0.509  [illegible]  [illegible]  [illegible]  [illegible]  [illegible] homolog
126  0.6  [illegible]  39138_g_at  X80878  Hs.95262  4798  nuclear factor related to kappa B binding protein
127  0.6  0.508  [illegible]  [illegible]  [illegible]  [illegible]  major vault protein
128  0.6  0.508  34473_at  AF051151  Hs.114408  7100  toll-like receptor 5
129  0.6  0.508  36755_s_at  M75914  Hs.68876  3568  interleukin 5 receptor, alpha
130  0.6  0.507  41000_s_at  [illegible]  [illegible]  [illegible]  cDNA, 3 end
131  [illegible]  [illegible]  [illegible]  L48516  Hs.296259  5446  paraoxonase 3
132  0.6  0.507  [illegible]  [illegible]  [illegible]  [illegible]  [illegible] 2, regulatory subunit [illegible] isoform
133  0.6  0.506  [illegible]  [illegible]  [illegible]  [illegible]  [illegible] (KOX7)
134  [illegible]  [illegible]  1270_at  M64788  Hs.75151  5909  RAP1, GTPase activating protein 1
135  0.59  0.506  1087_at  M60459  Hs.89548  2057  erythropoietin receptor
136  0.59  0.505  [illegible]  [illegible]  [illegible]  [illegible]  inositol polyphosphate-5-[illegible]
137  0.59  0.505  [illegible]  [illegible]  [illegible]  [illegible]  [illegible] dehydrogenase, C-2 to C-3 short chain
138  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible] component 4B
139  0.59  0.505  39612_at  AL050061  Hs.27371  -  clone DKFZp566J123
140  0.59  0.504  38850_at  M11119  Hs.272951  -  endogenous retrovirus envelope region mRNA (PL1)
141  0.59  0.504  34529_at  W26760  Hs.336635  -  cDNA
142  [illegible]  0.504  40394_at  L17128  Hs.77719  2677  gamma-glutamyl carboxylase
143  [illegible]  0.503  [illegible]  [illegible]  [illegible]  [illegible]  calcium channel, voltage-dependent, alpha 2/delta subunit [illegible]
144  0.58  0.503  [illegible]  [illegible]  [illegible]  [illegible]  [illegible] Kelch motif containing protein
145  0.58  0.503  [illegible]  [illegible]  [illegible]  [illegible]  [illegible] glycosyltransferase
146  [illegible]  0.502  [illegible]  [illegible]  [illegible]  [illegible]  nucleotide binding protein 1 (E.coli MinD like)
147  [illegible]  0.502  [illegible]  [illegible]  [illegible]  64146  hypothetical protein [illegible]
148  0.58  0.501  [illegible]  [illegible]  [illegible]  [illegible]  crystallin, gamma D
149  0.58  0.501  37151_at  [illegible]  [illegible]  -  clone [illegible]
150  0.58  0.501  37172_at  M75100  [illegible]  [illegible]  carboxypeptidase B2 (plasma)
151  0.58  0.500  [illegible]  [illegible]  [illegible]  [illegible]  Huntingtin interacting protein B
152  0.58  0.499  37722_s_at  [illegible]  [illegible]  1725  deoxyhypusine synthase
153  0.58  [illegible]  40600_at  AW024467  Hs.172847  3338  DnaJ (Hsp40) homolog, subfamily C, member 4
154  0.57  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  immunoglobulin superfamily, member [illegible]
155  0.57  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  crystallin, mu
156  0.57  [illegible]  [illegible]  [illegible]  [illegible]  -  [illegible] protein
157  0.57  0.498  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]-associated serine-arginine protein 2
158  0.57  0.498  [illegible]  [illegible]  [illegible]  [illegible]  kynureninase (L-kynurenine hydrolase)
159  0.57  0.497  [illegible]  [illegible]  [illegible]  [illegible]  cAMP responsive element binding protein-like 2
160  0.57  0.497  36997_at  J04809  Hs.76240  203  adenylate kinase 1
161  0.57  0.497  32076_at  D83407  Hs.156007  10231  Down syndrome critical region gene 1-like 1
162  0.57  0.497  32185_at  U00946  Hs.184592  65125  protein kinase, lysine deficient 1
163  0.57  0.496  36538_at  AB018314  Hs.6162  23368  KIAA0771 protein
164  0.56  0.496  41339_at  AF043117  Hs.24594  10277  ubiquitination factor E4B (homologous to yeast UFD2)
165  0.56  0.495  32144_at  AL050135  Hs.166891  5993  regulatory factor X, 5 (influences HLA class II expression)
166  0.56  0.495  37402_at  D26129  Hs.78224  6035  ribonuclease, RNase A family, 1 (pancreatic)
167  0.56  0.494  700_s_at  HG371-HT26388  -  -  Mucin 1, Epithelial, Alt. Splice 9
168  0.56  0.494  33521_at  M63962  Hs.36992  495  ATPase, H+/K+ exchanging, alpha polypeptide
169  0.56  0.494  34934_at  L29376  Hs.132807  -  (clone 3.8-1) MHC class I
170  0.56  0.494  41018_at  AL050015  Hs.92700  25864  DKFZP5640243 protein
171  0.56  0.493  37539_at  AB023176  Hs.79219  23179  RalGDS-like gene; KIAA0959 protein
172  0.56  0.493  36626_at  X87176  Hs.75441  3295  hydroxysteroid (17-beta) dehydrogenase 4
173  0.56  0.493  36012_at  Y09631  Hs.43913  10464  PIBF1 gene product
174  0.56  0.493  41491_s_at  AB028944  Hs.29189  23250  ATPase, Class VI, type 11A
175  0.56  0.493  32746_at  AF015451  Hs.195175  8837  CASP8 and FADD-like apoptosis regulator
176  0.56  0.492  40833_r_at  AL050126  Hs.234265  26092  DKFZP586G011 protein
177  0.56  0.492  34256_at  AB018356  Hs.225939  8869  sialyltransferase 9 (CMP-NeuAc:lactosylceramide alpha-2,3-sialyltransferase; GM3 synthase)
178  0.56  0.491  AFFX-DapX-M_at  L38424  -  -  B. subtilis dapB, jojF, jojG genes corresponding to nucleotides 1358-3197 of L38424 (-5, -M, -3 represent transcript regions 5 prime, Middle, and 3 prime respectively)
179  0.55  0.491  40547_at  AI688516  Hs.163867  4695  NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 2 (8kD, B8)
180  0.55  0.491  41488_at  AC002394  Hs.144852  -  hypothetical protein A-211C6.1
181  0.55  0.49[illegible]  41501_at  AF004849  Hs.30148  10114  homeodomain-interacting protein kinase 3
182  0.55  0.490  35287_at  AF046888  Hs.54673  8741  tumor necrosis factor (ligand) superfamily, member 13
183  0.55  0.490  33284_at  M19507  Hs.1817  4353  myeloperoxidase
184  0.55  0.490  40152_r_at  Z48054  Hs.158084  5830  peroxisome receptor 1
185  0.55  0.490  34001_at  AF033199  Hs.8198  7754  zinc finger protein 204
186  0.55  0.489  1527_s_at  U50527  Hs.22174  -  BRCA2 region
187  0.55  0.489  34141_at  AL109681  Hs.226017  -  clone EUROIMAGE 112333
188  0.55  0.489  34116_at  AF038852  Hs.21903  785  calcium channel, voltage-dependent, beta 4 subunit
189  0.55  0.488  36806_at  X83877  Hs.289104  11256  Alu-binding protein with zinc finger domain
190  0.55  0.488  39557_at  AI625844  Hs.295963  -  cDNA, 3 end
191  0.55  0.487  40595_at  AB45337  Hs.301266  6949  Treacher Collins-Franceschetti syndrome 1
192  0.55  0.487  39993_at  D11466  Hs.51  5277  phosphatidylinositol glycan, class A (paroxysmal nocturnal hemoglobinuria)
193  0.55  0.487  39947_at  AJ006352  Hs.42331  1945  ephrin-A4
194  0.55  0.487  785_at  U96114  Hs.315493  11060  Nedd-4-like ubiquitin-protein ligase
195  0.55  0.487  33569_at  D50532  Hs.54403  10462  macrophage lectin 2 (calcium dependent)
196  0.54  0.486  39171_at  W21787  Hs.99816  56998  beta-catenin-interacting protein ICAT
197  0.54  0.486  39678_at  D10511  -  -  acetyl-Coenzyme A acetyltransferase 1 (acetoacetyl Coenzyme A thiolase)
198  0.54  0.486  881_at  M35198  Hs.123125  3694  integrin, beta 6
199  0.54  0.485  40064_at  AB011121  Hs.154248  66008  amyotrophic lateral sclerosis 2 (juvenile) chromosome region, candidate 3
200  0.54  0.485  33800_at  AF036927  Hs.20196  115  adenylate cyclase 9

Table 5: Normal Lung Markers

[00135] According to the invention, preferred markers are markers 1-30, preferably 1-20, and more preferably 1-10. Highly preferred markers are transforming growth factor beta receptor II, dihydropyrimidinase-like 2, and tetranectin.
Class Norm

No.  s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)
1  1.97  0.677  32542_at  AF063002  Hs.239069  2273  four and a half LIM domains 1
2  1.85  0.631  1815_g_at  D50683  Hs.82028  7048  transforming growth factor, beta receptor II (70-80kD)
3  1.82  0.626  36119_at  AF070648  Hs.74034  -  clone 24651
4  1.75  0.603  35868_at  M91211  Hs.184  177  advanced glycosylation end product-specific receptor
5  1.71  0.600  39031_at  AA152406  Hs.114346  1346  cytochrome c oxidase subunit VIIa polypeptide 1 (muscle)
6  1.7  0.594  37398_at  AA100961  Hs.78146  5175  platelet/endothelial cell adhesion molecule (CD31 antigen)
7  1.7  0.592  40331_at  AF035819  Hs.67726  8685  macrophage receptor with collagenous structure
8  1.7  0.589  40607_at  U97105  Hs.173381  1808  dihydropyrimidinase-like 2
9  1.7  0.588  40841_at  AF049910  Hs.173159  6867  transforming, acidic coiled-coil containing protein 1
10  1.69  0.587  38454_g_at  X15606  Hs.83733  3384  intercellular adhesion molecule 2
11  1.65  0.582  36569_at  X64559  Hs.65424  7123  tetranectin (plasminogen-binding protein)
12  1.63  0.578  39066_at  L38486  Hs.296049  4239  microfibrillar-associated protein 4
13  1.6  0.576  40282_s_at  M84526  Hs.155597  1675  D component of complement (adipsin)
14  1.6  0.575  34320_at  AL050224  Hs.29759  22939  polymerase I and transcript release factor
15  1.6  0.574  37027_at  M80899  Hs.301417  195  AHNAK nucleoprotein (desmoyokin)
16  1.58  0.574  33328_at  W28612  Hs.296326  -  cDNA
17  1.58  0.573  35985_at  AB023137  Hs.42322  11217  A kinase (PRKA) anchor protein 2
18  1.57  0.572  770_at  D00632  Hs.336920  2878  glutathione peroxidase 3 (plasma)
19  1.55  0.570  38177_at  AJ001015  Hs.155106  10266  receptor (calcitonin) activity modifying protein 2
20  1.54  0.568  39760_at  AL031781  Hs.15020  9444  homolog of mouse quaking QKI (KH domain RNA binding protein)
21  1.54  0.567  268_at  L34657  -  -  platelet/endothelial cell adhesion molecule (CD31 antigen)
22  1.53  0.567  33756_at  U39447  Hs.198241  8639  amine oxidase, copper containing 3 (vascular adhesion protein 1)
23  1.51  0.567  32562_at  X72012  Hs.76753  2022  endoglin (Osler-Rendu-Weber syndrome 1)
24  1.51  0.566  40419_at  X85116  Hs.160483  2040  erythrocyte membrane protein band 7.2 (stomatin)
25  1.48  0.565  40994_at  L15388  Hs.211569  [illegible]  G protein-coupled receptor kinase 5
26  1.48  0.564  38430_at  AA198249  [illegible]  [illegible]  fatty acid binding protein 4, adipocyte
27  1.47  0.564  36155_at  D87465  [illegible]  9806  KIAA0275 gene product
28  1.47  0.564  39631_at  U52100  Hs.29191  2013  epithelial membrane protein 2
29  1.45  0.563  36627_at  X86693  Hs.75445  8404  SPARC-like 1 (mast9, hevin)
30  1.45  [illegible]  35730_at  X03350  Hs.4  125  alcohol dehydrogenase 2 (class I), beta polypeptide
31  1.42  0.561  34708_at  D88587  Hs.333383  8547  ficolin (collagen/fibrinogen domain-containing) 3 (Hakata antigen)
32  1.42  0.560  39775_at  X54486  Hs.151242  710  serine (or cysteine) proteinase inhibitor, clade G (C1 inhibitor), member 1
33  1.41  0.560  [illegible]  [illegible]  [illegible]  -  cDNA, 3 end
34  1.41  0.559  35261_at  W07033  Hs.5210  9535  glia maturation factor, gamma
35  1.4  0.559  39350_at  U50410  Hs.119651  2719  glypican 3
36  [illegible]  [illegible]  40560_at  U28049  Hs.168357  6909  T-box 2
37  1.39  0.559  607_s_at  [illegible]  Hs.110802  7450  von Willebrand factor
38  1.36  0.557  1596_g_at  L06139  Hs.89640  7010  TEK tyrosine kinase, endothelial (venous malformations, multiple cutaneous and mucosal)
39  1.36  0.557  38653_at  D11428  Hs.103724  5376  peripheral myelin protein 22
40  1.35  0.557  36577_at  Z24725  Hs.75260  10979  mitogen inducible 2
41  1.33  0.555  37976_at  AL034397  Hs.8904  11326  Ig superfamily protein
42  1.33  0.554  34210_at  N90866  Hs.276770  1043  CDW52 antigen (CAMPATH-1 antigen)
43  1.33  0.554  38508_s_at  U89337  Hs.169886  7148  DIRl protein
44  1.32  0.553  32780_at  AB018271  Hs.198689  26029  KIAA0728 protein
45  1.31  0.553  39634_at  AB017168  Hs.29802  9353  slit (Drosophila) homolog 2
46  1.31  0.552  38995_at  AF000959  Hs.110903  7122  claudin 5 (transmembrane protein deleted in velocardiofacial syndrome)
47  1.3  0.552  37099_at  AI806222  Hs.100194  241  arachidonate 5-lipoxygenase-activating protein
48  1.3  0.552  37196_at  X79981  Hs.76206  1003  cadherin 5, type 2, VE-cadherin (vascular epithelium)
49  1.29  0.552  36958_at  X95735  Hs.75873  7791  zyxin
50  1.28  0.552  38685_at  AL035306  Hs.106823  84295  hypothetical protein MGC14797
51  1.28  0.551  37307_at  X04828  Hs.77269  2771  guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 2
52  1.27  0.551  38704_at  AB007934  Hs.108258  23499  actin binding protein; macrophin (microfilament and actin filament cross-linker protein)
53  1.27  0.551  32166_at  AB028950  Hs.18420  7094  KIAA1027 protein
54  1.26  0.550  34874_at  AJ004832  Hs.5038  10908  neuropathy target esterase
55  1.26  0.549  36937_s_at  U90878  Hs.75807  9124  PDZ and LIM domain 1 (elfin)
56  1.25  0.549  37247_at  AF047419  Hs.78061  6943  transcription factor 21
57  1.25  0.549  39541_at  W52003  Hs.10491  57493  KIAA1237 protein
58  1.25  0.547  590_at  M32334  -  -  intercellular adhesion molecule 2
59  1.24  0.547  37168_at  AB013924  Hs.10887  27074  similar to lysosome-associated membrane glycoprotein
60  1.23  0.547  39038_at  AF093118  Hs.11494  10516  fibulin 5
61  1.23  0.547  40456_at  AL049963  Hs.284205  64116  up-regulated by BCG-CWS
62  1.23  0.546  40202_at  D31716  Hs.150557  687  basic transcription element binding protein 1
63  1.21  0.546  31856_at  Z24680  Hs.151641  2615  glycoprotein A repetitions predominant
64  1.2  0.545  32321_at  X56841  Hs.181392  3133  major histocompatibility complex, class I, E
65  1.19  0.545  37042_at  U09577  Hs.76873  8692  hyaluronoglucosaminidase 2
66  1.19  0.545  1897_at  L07594  Hs.79059  7049  transforming growth factor, beta receptor III (betaglycan, 300kD)
67  [illegible]  [illegible]  35783_at  [illegible]  Hs.66708  9341  vesicle-associated membrane protein 3 (cellubrevin)
68  1.17  0.544  32052_at  L48215  Hs.155376  3043  hemoglobin, beta
69  1.17  [illegible]  33862_at  AF017786  Hs.173717  8613  phosphatidic acid phosphatase type 2B
70  1.16  0.543  32812_at  AB029025  Hs.202949  22998  KIAA1102 protein
71  1.16  0.543  [illegible]  [illegible]  [illegible]  [illegible]  synaptopodin
72  1.15  0.542  37407_s_at  AF013570  Hs.78344  4629  myosin, heavy polypeptide 11, smooth muscle
73  1.15  0.541  38406_f_at  AI207842  Hs.8272  -  prostaglandin D2 synthase (21kD, brain)
74  1.14  0.541  216_at  M98539  -  -  prostaglandin D2 synthase (21kD, brain)
75  [illegible]  0.541  38700_at  M33146  Hs.108080  1465  cysteine and glycine-rich protein 1
76  1.13  0.541  39182_at  U87947  Hs.9999  2014  epithelial membrane protein 3
77  1.13  0.541  39315_at  [illegible]  [illegible]  284  angiopoietin 1
78  1.13  0.540  36207_at  D67029  Hs.75232  6397  SEC14 (S. cerevisiae)-like 1
79  1.13  0.540  38338_at  AI201108  Hs.9651  6237  related RAS viral (r-ras) oncogene homolog
80  1.11  [illegible]  38691_s_at  [illegible]  [illegible]  [illegible]  surfactant, pulmonary-associated protein C
81  1.11  0.539  32109_at  AA524547  Hs.160318  5348  FXYD domain-containing ion transport regulator 1 (phospholemman)
82  1.11  0.539  38044_at  AF035283  Hs.8022  11170  TU3A protein
83  1.1  0.537  40567_at  X01703  Hs.272897  7846  Tubulin, alpha, brain-specific
84  1.1  0.537  36908_at  M93221  -  -  mannose receptor, C type 1
85  1.1  0.537  35183_at  U78735  Hs.26630  21  ATP-binding cassette, sub-family A (ABC1), member 3
86  1.09  0.537  538_at  S53911  Hs.85289  947  CD34 antigen
87  1.09  0.536  33283_at  AF106941  Hs.18142  409  arrestin, beta 2
88  1.08  0.536  33295_at  X85785  Hs.183  2532  Duffy blood group
89  1.08  0.536  38972_at  AF052169  Hs.109438  -  clone 24775
90  1.07  0.536  33137_at  Y13622  Hs.85087  8425  latent transforming growth factor beta binding protein 4
91  1.07  0.535  39588_at  AF055872  Hs.26401  8742  tumor necrosis factor (ligand) superfamily, member 12
92  1.06  0.535  38786_at  AL079279  Hs.8963  -  clone EUROIMAGE 248114
93  1.06  0.535  33833_at  J05243  Hs.77196  6709  spectrin, alpha, non-erythrocytic 1 (alpha-fodrin)
94  1.06  0.534  35164_at  AF084481  Hs.26077  7466  Wolfram syndrome 1 (wolframin)
95  1.05  0.534  37718_at  D43636  Hs.79025  23182  KIAA0096 protein
96  1.05  0.534  1780_at  M19722  Hs.1422  2268  Gardner-Rasheed feline sarcoma viral (v-fgr) oncogene homolog
97  1.05  0.534  36668_at  M28713  -  -  diaphorase (NADH) (cytochrome b-5 reductase)
98  1.05  0.534  41338_at  AI951946  Hs.21907  11143  histone acetyltransferase
99  1.04  0.533  32527_at  AI381790  Hs.74120  10974  adipose specific 2
100  1.04  0.533  34363_at  Z11793  Hs.3314  6414  selenoprotein P, plasma, 1
101  1.04  0.533  37743_at  U60060  Hs.79226  9638  fasciculation and elongation protein zeta 1 (zygin I)
102  1.03  0.533  32838_at  S67247  Hs.296842  -  smooth muscle myosin heavy chain isoform SMemb [human, umbilical cord, fetal aorta,
103  1.03  0.533  40739_at  M83670  Hs.89485  762  carbonic anhydrase IV
104  1.03  0.533  39057_at  L04733  Hs.117977  3831  kinesin 2 (60-70kD)
105  1.03  0.532  35625_at  X94630  Hs.3107  976  CD97 antigen
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s2n_obs Perm 0.1% non_norm_list GB/TIGR Identifier

84 1.1 0.537 36908_at M93221
85 1.1 0.537 35183_at U78735
86 1.09 0.537 538_at S53911
87 1.09 0.536 33283_at AF106941
88 1.08 0.536 33295_at X85785
89 1.08 0.536 38972_at AF052169
90 1.07 U.DjO 33137_at Y13622
91 1.07 0.535 39588_at AF055872
92 1.06 0.535 JO /oo^ai AT n7Q97Q
93 1.06 0.535 33833_at J05243
94 1.06 0.534 35164_at AF084481
95 1.05 0.534 37718_at D43636
96 1.05 0.534 1780_at M19722
97 1 36668_at M28713
98 1.05 0.534 41338_at AI951946
99 1.04 0.533 32527_at AI381790
100 1.04 0.533 34363_at Z11793
101 1.04 0.533 37743_at U60060
102 1.03 0.533 32838_at S67247
103 1.03 0.533 40739_at M83670


104 1.03 0.533 39057_at L04733 

105 1.03 0.532 35625_at X94630 






UNIGENE (as of summer 2001) LL_num Desc (unigene/locuslink or affy)

mannose receptor, C type 1
Hs.26630 21 ATP-binding cassette, sub-family A (ABC1), member 3
Hs.85289 947 CD34 antigen
Hs.18142 409 arrestin, beta 2
Hs.183 2532 Duffy blood group
Hs.109438 clone 24775
Hs.85087 8425 latent transforming growth factor beta binding protein 4
Hs.26401 8742 tumor necrosis factor (ligand) superfamily, member 12
Hs.8963 clone EUROIMAGE 248114
Hs.77196 6709 spectrin, alpha, non-erythrocytic 1 (alpha-fodrin)
Hs.26077 7466 Wolfram syndrome 1 (wolframin)
Hs.79025 23182 KIAA0096 protein
Hs.1422 2268 Gardner-Rasheed feline sarcoma viral (v-fgr) oncogene homolog
diaphorase (NADH) (cytochrome b-5 reductase)
Hs.21907 11143 histone acetyltransferase
Hs.74120 10974 adipose specific 2
Hs.3314 6414 selenoprotein P, plasma, 1
Hs.79226 9638 fasciculation and elongation protein zeta 1 (zygin I)
Hs.296842 smooth muscle myosin heavy chain isoform SMemb [human, umbilical cord, fetal aorta,
Hs.89485 762 carbonic anhydrase IV
Hs.117977 3831 kinesin 2 (60-70kD)
Hs.3107 976 CD97 antigen



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PCT/US02/30797 





s2n_obs Perm 


non_nonn_list 


IjrrJ/lKjK 


UMlOJiJNr, 




Desc 






0 1% 




Id^tifier 


(as of 


m 


(uiugene/locuslisk or 












summer 




any; 












2001) 






106 

L\J\J 


1 0"? 




ACM AO at 


M16591 


Hs.89555 




hemopoietic cell kinase


1 AT 


1 0^ 




3o/17_at 


AL050159 Hs.288771 


OCO>l A 


DKFZP586A0522 
















protein 


1 VO 


1 A'5 


A ^11 


jZZJH at 


AL050223 


Hs.194534 




vesicle-associated membrane protein 2 (synaptobrevin 2)


1 no 


1 O'^ 


U.J J i 


jouzo_aL 


U01244 


Hs.79732 


01 QO 


fibulin 1


110 
i lU 


1 09 




J lyoo^dx 


AL049257 Hs.8769 


o30U4 


hypothetical protein 
















UKrZip/olJl /121 


111 
ill 


1 A9 


A A 


3 /!>yb__at 


D79990 


Hs.80905 


9770 


Ras association 
















(KaKjDb/ At -o) 
















domain family 2 


1 1 o 


1 09 
i.UZ 


A ^^A 


'3 A 1 >l C 

jyi4!)_at 


J02854 


Hs.9615 


1 AO AO 

10398 


myosin regulatory 
















light chain 2, smooth 
















muscle isoform 




1 09 


0 S'^0 
U.J 


*f u / / J__al 


AL021786 Hs.17109 


O/l^O 
i74DZ 


integral membrane protein 2A


11/1 


1 09 


O ^90 


i5z5z_r_ai 


M33680 


Hs.54457 


ATC 

975 


CD81 antigen (target 
















of antiproliferative 
















antibody 1) 


11^ 
1 ID 


1 AO 


A <OQ 




J02923 


Hs.76506 


3930 


lymphocyte cytosolic 
















protein 1 (L-plastin)


11^ 
1 ID 


1 09 


0 S90 


3o/4o_at 


U76421 


Hs.85302 


1 A/1 

104 


adenosine deaminase, RNA-specific, B1 (homolog of rat RED1)


117 


1 01 


0 ^90 


All QQ of 

4iiyo_at 


AF055008 


Hs.180577 


OOAiC 


granulin 


118 


1 


A ^95 


'2/110/1 of 


AL049313 


Hs.21103 




clone iJJsJ:*zpjo4i5U/o 


1 1 Q 

1 ly 


1 

I 


A ^9R 


331 jo__ai 


M97252 


Hs.89591 




Kallmann syndrome 1 
















sequence 


1 OA 


A OO 

u.yy 


A coo 


0 1 coc « «+ 

31!)z!>_s_at 


J00153 






. hemoglobin, alpha 2 


Izi 


0 QQ 


A ^99 


32847_at 


U48959 


Hs.211582 


4638 


myosin, light 
















polypeptide kinase 


1 99 


A QQ 


U.JZ / 


'^Ql 1 A of 


AF000652 


Hs.8180 


o3oO 


syndecan binding
















protein (syntenin) 


123 


0.98 


0.527 


39220 at 


T92248 


Hs.2240 


7356 


uteroglobin 


124 


A OS 


A <07 


38119_at 


X12496 


Hs.81994 


2995 


glycophorin C 
















(Gerbich blood group)


125 


0.98 


0.527 


400'}/: of 


AI651806 


Hs.19280 




cysLdiiC''nui motor 
















neuron 1 


126 


0.98 


0.527 


37194_at 


M68891 


Hs.334695 


2624 


GATA-binding protein 


127 


0.97 


0.526 


41620_at 


AB018259 Hs. 118140 


9732 


2 

KIAA0716 gene 
















product 
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s2n_obs Perm non_nonn_list 
0.1% 



128 0.96 

129 0.95 



0.526 37951_at
0.526 657_at



GB/TIGR Identifier UNIGENE (as of summer 2001) LL_num

AF035119 Hs.8700 10395
L11373 Hs.284180 5098



130 0.95 U.523 37009_at AL035079 Hs.76359 847
131 0.95 0.525 33390_at AA203487 Hs.314363
132 0.95 0.525 40434_at U97519 Hs.16426 5420
133 0.95 0.525 37022_at U41344
134 0.95 0.525 31792_at M20560 Hs.1378 306
135 0.94 0.524 38113_at AB018339 Hs.8182 23345
136 0.94 0.524 35152_at AJ001016 Hs.25691 10268
137 0.93 0.524 1879_at M14949
138 0.93 0.524 41734_at AB020677 Hs.18166
139 0.92 0.524 36495_at U21931
140 0.92 0.524 1370_at M29696 Hs.237868 3575
141 0.92 0.523 1598_g_at L13720 Hs.78501 2621
142 W60864 Hs.9963 7305
143 0.92 0.523 32035_at M16942 Hs.318720
144 0.92 0.523 41209_at M15856 Hs.180878 4023
145 0.92 0.523 1612_s_at X56681 Hs.2780 3727
146 0.91 0.523 34091_s_at Z19554 Hs.297753 7431
147 0.91 0.522 479_at U53446 Hs.81988 1601
148 0.91 0.522 39615_at AB028949 Hs.27742 23254
149 0.9 0.522 692_s_at J02947 Hs.2420 6649



150 0.9 

151 0.9 



0.521 36065_at 
0.521 40570_at



AF052389 Hs.4980 9079
AF032885 Hs.170133 2308



Desc 

(unigene/locuslink or
affy)

deleted in liver cancer
1

protocadherin gamma 
subfamily C, 3 
catalase 
CD68 

podocalyxin-like 
proline arginine-rich 
end leucine-rich repeat 
protein 
annexin A3 
synaptic nuclei 
expressed gene lb 
receptor (calcitonin) 
activity modifying 
protein 3 

related RAS viral (r- 
ras) oncogene 
homolog 

KIAA0870 protein 
fructose-1,6- 
bisphosphatase 1 
interleukin 7 receptor 
growth arrest-specific 
6 

TYRO protein tyrosine 
kinase binding protein 
MHC class II HLA-
DRw53-associated 
glycoprotein beta- 
chain 

lipoprotein lipase 
jun D proto-oncogene 
vimentin 

disabled (Drosophila) 
homolog 2 (mitogen- 
responsive 
phosphoprotein) 
KIAA1026 protein 
superoxide dismutase 
3, extracellular 
LIM domain binding 2 
forkhead box O1A
(rhabdomyosarcoma) 
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s2n obs Perm 


non_nonn_iist 




UINHjUlNJCJ 


T T mi 


Desc 






U.l /o 




laennncr 


(as of 


m 


(^umgene/ xocusuixk or 












summer 




any) 


















152 




A <01 


37148_at 


AF025533 


Hs.105928 


11025 


leukocyte 
















immunoglobulin-like 
















receptor, subfamily B 
















^wiui ijvi ana iiiivi 
















domams), member 3 


1 ^1 
ijj 




0 S91 

1 


412oo_at 


AL036744 Hs.279009 


42D0 


matrix Gla protein 


1 <A 




A 


jZo 1 i_al 


X98507 


Hs.286226 


4041 


myosin IC


155 


U.OO 


A <01 


37384_at 


D13640 


Hs.278441 


9647 


KIAA0015 gene
















product 


156 


A OO 

U.OO 


A COA 


41325_at 


AF006823 


Hs.24040 


3777 


. t 1 

potassium channel, 
















subfamily K, member
















3 (TASK) 


1 

ID / 


A OO 

U.OO 


A dA 


4Ui22_at 


D12763 


Hs.66 


91 /3 


interleukin 1 receptor- 
















like 1 


1 CO 

158 


U.OO 


A COA 


32905_s_at 


M30038 


Hs.334455 


7176 


tryptase, alpha 


159 


A on 
U.o/ 


A COA 

U.52U 


34873_at 


Y16241 


Hs.5025 


10529 


nebulette 


160 


U.O/ 


A ^OA 


610_at 


M15169 


Hs.2551 


154 


adrenergic, beta-2-, 
















receptor, surface 


lol 


A C? 

U.87 


A COA 

0.520 


41o44_at 


AB018333 Hs.12002 




KIAA0790 protein


162 


U.O / 


A ^OA 


36894_at 


AL031846 






chromobox homolog 7 


lo3 


A 

U.O / 


A <;7A 


33891_at 


AL080061 


Hs.25035 


25932 


chloride intracellular 
















channel 4 


lo4 


U.O / 


A ^9A 


4U147_at 


U18009 


Hs.157236 


1 A/l Al 

1U493 


membrane protein of 
















cholinergic synaptic 
















vesicles 


165 


A C7 
U.o/ 


A KO(\ 
U.DZU 


38796__at 


X03084 


Hs.8986 


713 


complement 
















component 1, q 
















subcomponent, beta 
















polypeptide 


loo 


A 07 

U.o/ 


A ^OA 


3o85o_at 


W28743 


Hs.7159 


OAI A1 

80301 


hypothetical protein 
















"DTI 1 /TO 0 

rr lo2o 


167 


U.57 


U.52U 


1038_s_at 


U19247 






interferon gamma
















receptor 1 


1 ^0 
lOO 


0.60 


A CI A 

0.5 ly 


34oi /_i_at 


M12963 


Hs.73843 


124 


alcohol dehydrogenase 
















1 (class I), alpha 
















polypeptide 


1 /^O 


0.85 


0.519 


3o/4/ at 


M81945 






CD34 antigen


170 


0.84 


0.519 


32747_at 


X05409 


Hs.195432 


217 


aldehyde 
















dehydrogenase 2, 
















mitochondrial 


171 


0.84 


0.519 


32749_s_at 


AL050396 Hs.l95464 


2316 


filamin A, alpha 
















(actin-binding protein- 



280) 
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s2n_obs Penn nonjiorm_list 
0.1% 



172 0.84 0.519 38087_s_at 



GB/TIGR Identifier UNIGENE (as of summer 2001) LL_num

W72186 Hs.81256 6275 



173 0.84 0.518 38095_i_at M83664 Hs.814 3115



174 0.84 

175 0.84 

176 0.83 

177 0.83 



0.518 40203_at 

0.518 34224_at 

0.518 307_at 

0.518 38968_at



178 0.83 0.517 39114_at 



179 0.83 0.517 41385_at 



180 0.83 0.517 39400_at

181 0.83 0.517 39081_at 

182 0.82 0.517 33813_at 



183 0.82 0.517 31775_at 

184 0.82 0.517 32855_at 

185 0.82 0.516 40480_s_at 

186 0.81 0.516 36156_at

187 0.81 0.516 41439_at

188 0.81 0.516 774_g_at



AJ012375 Hs.150580 10209 

AC004770 Hs.21765 3995 

J03600 Hs.89499 240 

AB005047 Hs.109150 9467 

AB022718 Hs.93675 11067 

AB023204 Hs.103839 23136



AB028978 Hs.126084 23102
AI547258 Hs.118786 4502
AI813532 Hs.256278 7133



X65018 
L00352 

M14333 Hs.169370 2534 

U41518 Hs.74602 358 

AJ001381 Hs.121576 

D10667 



Desc 

(unigene/locuslink or 
affy) 

S100 calcium-binding
protein A4 (calcium 
protein, calvasculin, 
metastasin, murine 
placental homolog) 
major 

histocompatibility 
complex, class II, DP
beta 1

putative translation 
initiation factor 
flap structure-specific 
endonuclease 1 
arachidonate 5- 
lipoxygenase 
SH3-domain binding 
protein 5 (BTK- 
associated) 
decidual protein 
induced by 
progesterone 
differentially 
expressed in 
adenocarcinoma of the 
lung 

KIAA1055 protein 
metallothionein 2A 
tumor necrosis factor 
receptor superfamily, 
member 1B
surfactant, pulmonary-
associated protein D
low density lipoprotein
receptor (familial
hypercholesterolemia) 
FYN oncogene related 
to SRC, FGR, YES
aquaporin 1 (channel- 
forming integral 
protein, 28kD) 
incomplete cDNA for 
a mutated allele of a 
myosin class I, myh-lc 
myosin, heavy 
polypeptide 11, 
smooth muscle 
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s2n_obs Perm noii_norm_list GB/TIGR UNIGENE LLjiu 
0.1% Identifier (as of m 

summer 
2001) 

189 0.81 0.516 924_s_at J03805 Hs.80350 5516



190 0.81 

191 0.81 



0.516 40771_at
0.515 38833_at



192 0.81 0.515 41143_at



Z98946 
X00457 



U12022 



Hs.170328 4478
Hs.914 



193 0.8 0.515 37176_at U96078 Hs.75619 3373 

194 0.8 0.515 36447_at S80990 

195 0.8 0.515 1052_s_at M83667 Hs.76722 1052 

196 0.8 0.515 41723_s_at M32578 Hs.180255 3123

197 0.8 0.515 38404_at M55153 Hs.8265 7052



198 0.8 0.515 34760_at 

199 0.79 0.515 32569_at 

200 0.79 0.514 505_at



D14664 Hs.2441 9936
L13385 Hs.77318 5048

U43077 Hs.160958 11140



Desc 

(unigene/locuslink or
affy) 

protein phosphatase 2 

(formerly 2A), 

catalytic subunit, beta 

isoform 

moesin 

SB class II

histocompatibility 

antigen alpha-chain 

calmodulin 1 

(phosphorylase kinase, 

delta) 

hyaluronoglucosamini 

dase 1 

ficolin 

(collagen/fibrinogen
domain-containing) 1 
CCAAT/enhancer 
binding protein 
(C/EBP), delta 
major 

histocompatibility 
complex, class II, DR
beta 1

transglutaminase 2 (C 
polypeptide, protein- 
glutamine-gamma- 
glutamyltransferase) 
KIAA0022 gene 
product 

platelet-activating 
factor acetylhydrolase, 
isoform Ib, alpha
subunit (45kD) 
CDC37 (cell division 
cycle 37, S. cerevisiae, 
homolog) 



Table 6: Colorectal Metastasis Markers

[00136] According to the invention, preferred markers are markers 1-30, preferably 1-20,
and more preferably 1-10. Highly preferred markers are cytokeratin 20 and villin 1.
Class: Colon 
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s2n_obs Perm 0.1% non_norm_list GB/TIGR Identifier UNIGENE (as of summer 2001) LL_num Desc (unigene/locuslink or affy)

1 2.33 0.914 40392_at U51096 Hs.77399 1045 caudal type homeo box transcription factor 2
2 1.58 0.728 40736_at X83228 Hs.89436 1015 cadherin 17, LI cadherin (liver-intestine)
3 1.55 0.719 37124_i_at J04813 Hs.104117 1577 cytochrome P450, subfamily IIIA (niphedipine oxidase), polypeptide 5
4 1.52 0.715 169_at U51095 Hs.1545 1044 caudal type homeo box transcription factor 1
5 1.45 0.701 40043_at X71345 Hs.58247 5647 protease, serine, 4 (trypsin 4, brain)
6 1.4 0.698 35644_at AB014598 Hs.31720 9843 hephaestin
7 1.37 0.688 38586_at M10050 Hs.5241 2168 fatty acid binding protein 1, liver
8 1.37 0.682 32972_at Z83819 Hs.132370 27035 NADPH oxidase 1
9 1.34 0.679 39951_at L20826 Hs.430 5357 plastin 1 (I isoform)
10 1.3 0.677 1229_at U78556 Hs.166066 10903 cisplatin resistance associated
11 1.3 0.677 988_at X16354 Hs.50964 634 carcinoembryonic antigen-related cell adhesion molecule 1 (biliary glycoprotein)
12 1.3 0.669 37415_at AB018258 Hs.109358 23120 ATPase, Class V, type 10B
13 1.25 0.668 41708_at AB028957 Hs.12896 23314 KIAA1034 protein
14 1.22 0.656 765_s_at AB006781 Hs.5302 3960 lectin, galactoside-binding, soluble, 4 (galectin 4)
15 1.21 0.654 39697_at U26726 Hs.1376 3291 hydroxysteroid (11-beta) dehydrogenase 2
16 1.2 0.650 33559_at U61412 PTK6 protein tyrosine kinase 6
17 1.2 0.649 33904_at AB000714 Hs.25640 1365 claudin 3
18 1.19 0.649 41266_at X53586 Hs.227730 3655 integrin, alpha 6
19 1.19 0.648 36170_at D83198 Hs.7486 23474 protein expressed in thyroid
20 1.18 0.648 37847_at AB006955 Hs.132945 10083 PDZ-73 protein
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21 1.16 



22 1.16 

23 1.14 

24 1.14 



25 1.11 

26 1.11 

27 1.1 



28 1.08 



29 1.07 

30 1.07 

31 1.07 

32 1.07 



33 1.05 

34 1.05 

35 1.04 

36 1.03 

37 1.03 



Perm non_norm_list GB/TIGR UNIGENE LL_num
0.1% Identifier (as of 

summer 
2001) 

AF105424 Hs.5394 4640 



0.646 34595 at 



0.644 
0.639 
0.638 



40694_at
35415_at
899_at



0.638 37875 at 



0.635 
0.632 



41678_at 
32649 at 



0.629 35114 at 



0.629 36832 at 



0.627 
0.624 



41396_at 
35256 at 



0.620 33436 at 



X73502 Hs.84905 54474 
X12901 Hs.166068 7429
L38517 Hs.69351 3549



U79725 Hs.143131 10223

AF025304 Hs.125124 2048
X59871 Hs.169294 6932



AF084645 Hs.118138 8856



AB015630 Hs.69009 10331

AB006629 Hs.104717 7461
AL096737 Hs.5167

Z46629 Hs.2316 6662



0.620 33789_at AF088219 Hs.272493 6359

0.619 34450_at M73489 Hs.1085 2984

0.619 31355_at U77629 Hs.135639 430

0.618 39732_at X73882 Hs.146388 9053

0.617 40061_at D83784 Hs.154104 5326



Desc 

(unigene/locuslink 
or affy) 

myosin, heavy 
polypeptide-like 
(110kD)
cytokeratin 20 
villin 1 

Indian hedgehog 

(Drosophila) 

homolog 

glycoprotein A33 

(transmembrane) 

EphB2 

transcription factor 
7 (T-cell specific, 
HMG-box) 
nuclear receptor 
subfamily 1, group 
I, member 2 
transmembrane 
protein 3 

cytoplasmic linker 2 
clone 

DKFZp434F152 

SRY (sex 

determining region 
Y)-box 9 
(campomelic 
dysplasia, 
autosomal sex- 
reversal) 
small inducible 
cytokine subfamily 
A (Cys-Cys), 
member 23 
guanylate cyclase 
2C (heat stable 
enterotoxin 
receptor) 
achaete-scute 
complex 
(Drosophila) 
homolog-like 2 
microtubule- 
associated protein 7 
pleiomorphic 
adenoma gene-like 
2 
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38 1.03 



39 1.03 



40 1.03 

41 1.02 



s2n_obs Perm non_norm_list GB/TIGR UNIGENE LL_num
0.1% Identifier (as of 
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2001) 

0.617 38469_at M35252 Hs.84072 7103



42 1.01



43 1.01



0.615 246_at



M25629 Hs.123107 3816



0.613 36742_at U34249 Hs.337461 89870
0.613 36816_s_at M28668 Hs.663 1080



0.612 38495_s_at U27328 Hs.169238 2525



0.611 1973_s_at V00568 Hs.79070 4609



44 1.01 0.611 37857_at AL080188 Hs.137556 92211
45 1 0.610 40198_at L06132 Hs.149155 7416
46 0.99 0.607 33824_at X74929 Hs.242463 3856
47 0.99 0.607 38160_at AF011333 Hs.153563 4065
48 0.99 0.607 34280_at Y09765 Hs.22785 2564
49 0.98 0.606 31608_g_at AJ002428 Hs.201553 10065
50 0.98 0.606 820_at U77604 Hs.81874 4258
51 0.98 0.606 34176_at AF091087 Hs.206501 57228
52 0.98 0.605 40647_at Z32684 Hs.78919 7504



Desc 

(unigene/locuslink 
or affy) 

transmembrane 4
superfamily 
member 3 
kallikrein 1, 
renal/pancreas/saliv 
ary 

ring finger protein 9 
cystic fibrosis 
transmembrane 
conductance 
regulator, ATP- 
binding cassette 
(sub-family C, 
member 7)
fucosyltransferase 3
(galactoside 3(4)-L-
fucosyltransferase,
Lewis blood group 
included) 
v-myc avian 
myelocytomatosis 
viral oncogene 
homolog 

MT-protocadherin 
voltage-dependent 
anion channel 1 
keratin 8 

lymphocyte antigen 
75 

gamma-aminobutyric acid
(GABA) A
receptor, epsilon 
voltage-dependent 
anion channel 1 
pseudogene 
microsomal 
glutathione S- 
transferase 2 
hypothetical protein 
from clone 643 
Kell blood group 
precursor (McLeod 
phenotype) 









s2n_obs Perm non_norm_list GB/TIGR UNIGENE LL_num 
0.1% Identifier (as of 

summer 
2001) 

53 0.98 0.604 36655_at L27476 Hs.75608 9414



54 0.97 



56 0.96 



57 0.96 

58 0.96 



59 0.95 



60 0.95 

61 0.95 



62 0.95 

63 0.94 

64 0.94 



65 0.94 

66 0.93 

67 0.93 



68 0.92 



0.604 37050_r_at AI130910 Hs.76927 10953



55 0.97 0.604 32324_at X57346 Hs.279920 7529



0.604 41715_at Y11312 Hs.132463 5287



0.604 40492_at
0.603 575_s_at



0.603 39721_at 
0.602 34803_at 

0.602 32587_at



0.602 41359_at
0.602 1291_s_at

0.602 37253_at



0.601 38005_at



AB020633 Hs.169600 23045
M93036



0.603 1756_f_at D00003 Hs.329704 1575



0.603 37950_at X74496 Hs.86978 5550
0.603 35489_at M82962 Hs.179704 4224



U09303 Hs.144700 1947
AF022789 Hs.42400 9959

U07802 Hs.78909 678 



Z98265 Hs.26557 11187 
L03840 Hs.165950 2264 

X92493 Hs.78406 8395 



AJ005866 Hs.90078 11046 



Desc 

(unigene/locuslink 
or affy) 

tight junction 
protein 2 (zona 
occludens 2) 
translocase of outer 
mitochondrial 
membrane 34 
tyrosine 3- 
monooxygenase/try 
ptophan 5- 
monooxygenase 
activation protein, 
beta polypeptide 
phosphoinositide-3- 
kinase, class 2, beta 
polypeptide 
KIAA0826 protein 
tumor-associated 
calcium signal 
transducer 1 
cytochrome P450, 
subfamily IIIA
(niphedipine 
oxidase), 
polypeptide 3 
prolyl 

endopeptidase 
meprin A, alpha 
(PABA peptide 
hydrolase) 
ephrin-Bl 
ubiquitin specific 
protease 12 
butyrate response 
factor 2 (EGF- 
response factor 2) 
plakophilin 3 
fibroblast growth 
factor receptor 4 
phosphatidylinositol 
-4-phosphate 5- 
kinase, type I, beta 
nucleotide-sugar 
transporter similar 
to C. elegans sqv-7 
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Perm 


non norm list GB/TIGR UNIGENE LL_num 


Desc 






A 1 OA 
U.JL /O 




Identifier (as of 




(unigene/locuslink 
or affy) 










1001 




even-skipped 
homeo box 1 
(homolog of 
Drosophila) 




n 00 


U.OUi 


41448_at 


AC004080 Hs.l 10637 


3206 




u.yi 


u.ouu 


39748_at 


AL050021 Hs.14846 




clone 

DKFZp564D016 


71 


0.91 


0.600 


35276_at 


AB000712 Hs.5372 


1364 


claudin 4 


72 


0.9 


0.599 


37244_at 


AA74635 Hs.77917 

C 

D 


7347 


ubiquitin carboxyl- 
terminal esterase L3 

(ubiquitin
thiolesterase) 


73 


0.9 


0.599 






10449 


acetyl-Coenzyme A 
acyltransferase 2 
(mitochondrial 3- 
oxoacyl-Coenzyme 
A thiolase) 


74 


0.9 


0.598 








fucosyltransferase 6
(alpha (1,3)
fucosyltransferase)


75 


0.9 


0.598 


36846_s_at 


AA12150 Hs.70830 
9 


51690 


U6 snRNA- 
associated Sm-like
protein LSm7 


/O 


A QQ 




35262_at 


AF022229 Hs.5215 


3692 


integrin beta 4
binding protein 


77 


0.89 


0.597 


41olo at 


AT n/IQQ<1 XIo ^'7Q7'2 
Al-.U4y oD i US.D / J 


Z7/ / J 


hypothetical protein 


78 


0.89 


0.597 


38739_at 


AF017257 Hs.85146 


2114 


v-ets avian
erythroblastosis
virus E26 oncogene
homolog 2


79 


0.89 


0.596 


1 n 

iyjo_s_at 


HT4899 




Proto-Oncogene C-Myc,
Alt. Splice 3,
Orf 114


80 


0.89 


0.596 


1 1 OAS of 




6297 


ribosomal protein 
S21 


81 


0.88 


0.596 


36687_at 


N50520 Hs.75752 


1349 


cytochrome c 
oxidase subunit
VIIb


82 


0.88 


0.595 


OA/fO e at 




4602 


v-myb avian 
myeloblastosis viral 
oncogene homolog 


83 


0.87 


0.595 


38375_at 


AF112219 Hs.82193


2098 


esterase 

D/formylglutathion 
e hydrolase 


84 


0.86 


0.594 


35961_at 


AL049390 Hs.22689 




clone 

DKFZp58601318 
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85 0.86 



86 0.86 

87 0.86 



88 0.86 



89 0.86 



90 0.86 



91 0.86 

92 0.86 

93 0.85 



94 0.85 



95 0.85 



96 0,85 



97 0.84 

98 0.84 



Perm non_norm_list GB/TIGR UNIGENE LL_num
0.1% Identifier (as of 

summer 
2001) 

0.594 1582 at M29540 Hs.220529 1048 



0.594 37888_at D87449 Hs.82635 23169 
0.594 266 s at L33930 Hs.286124 934 



0.593 31845 at U32645 Hs. 151 139 2000 



0.593 37211 at M93107 Hs.76893 622 



0.592 35345 at X83618 Hs.59889 3158 



0.592 
0.592 



41236_at 
37698 at 



0.591 32585 at 



0.590 38808 at 



0.590 1317 at 



U79252 Hs.240062 29787 
X97335 Hs.78921 8165 

AF027299 Hs.7857 2037 



D64154 Hs.90107 11047 



0.590 37104 at L40904 Hs.l00724 5468 



X70040 Hs.2942 4486 



0.590 37413_at J05257 Hs.l09 1800 
0.589 36345_g_at U34038 Hs.l54299 2150 



Desc 

(unigene/locuslink 
or affy) 

carcinoembryonic 
antigen-related cell 
adhesion molecule 

5 

KIAA0260 protein 
CD24 antigen 
(small cell lung 
carcinoma cluster 4 
antigen) 

E74-like factor 4 
(ets domain 
transcription factor) 
3-hydroxybutyrate 
dehydrogenase 
(heart, 

mitochondrial) 
3-hydroxy-3-methylglutaryl-Coenzyme A
synthase 2 
(mitochondrial) 
hypothetical protein 
A kinase (PRKA) 
anchor protein 1 
erythrocyte 
membrane protein 
band 4.1-like 2
cell membrane 
glycoprotein, 
110000M(r) 
(surface antigen) 
peroxisome 
proliferative 
activated receptor, 
gamma 
macrophage 
stimulating 1 
receptor (c-met- 
related tyrosine 
kinase) 
dipeptidase 1 
(renal) 

coagulation factor II

(thrombin) 
receptor-like 1 
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99 0.84 0.589 38036_at 



100 0.84 0.589 39765_at 

101 0.84 0.588 36363_at 



110 0.82 

111 0.82 



113 0.82 

114 0.82 

115 0.82 



GB/TIGR UNIGENE LL_num
Identifier (as of 
summer 
2001) 

L35035 Hs.79886 22934 



AB002318 Hs.150443 23079 
U30930 Hs.158540 7368 



102 0.84 0.587 1031_at U09564 Hs.75761 6732
103 0.84 0.587 35913_at U88047 Hs.198515 1820
104 0.83 0.587 39119_s_at AA631972 Hs.943 9235
105 0.83 0.587 37896_at AI474125 Hs.82961 7033
106 0.83 0.587 33892_at X97675 Hs.25051 5318
107 0.83 0.587 1506_at D11086 Hs.84 3561



108 0.83 0.587 1237_at 

109 0.82 0.586 35194_at 



0.586 36650_at 
0.586 2075 s at 



S81914 Hs.76095 8870
X53463 Hs.2704 2877 



D13639 Hs.75586 894 
L36719 Hs.180533 5606 



112 0.82 0.586 40182_s_at AF055027 Hs.143696 10498



0.586 786_at 
0.585 901_g_at 
0.585 41200 at 



X06745 Hs.267289 5422 
L41349 Hs.283006 5332 
Z22555 Hs.180616 949 



Desc 

(unigene/locuslink
or affy)

ribose 5-phosphate 
isomerase A (ribose 
5-phosphate 
epimerase) 
KIAA0320 protein 
UDP 

glycosyltransferase 
8 (UDP-galactose 
ceramide 

galactosyltransferas 
e) 

SFRS protein 
kinase 1 
dead ringer 
(Drosophila)-like 1 
natural killer cell 
transcript 4 
trefoil factor 3 
(intestinal) 
plakophilin 2 
interleukin 2 
receptor, gamma 
(severe combined 
immunodeficiency) 
immediate early 
response 3 

glutathione 
peroxidase 2 

(gastrointestinal) 

cyclin D2 

mitogen-activated 

protein kinase 

kinase 3 

coactivator- 

associated arginine 

methyltransferase-1 

polymerase (DNA 

directed), alpha 

phospholipase C, 

beta 4 

CD36 antigen 
(collagen type I 
receptor, 
thrombospondin 
receptor)-like 1 
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iion_ttoriii_usi 


GB/TIGR 


UMGENE LL num 


Desc 






A 1 0/ 




Identifier 


(as of 




(unigene/lociislmk 












summer 




or aiiyj 












2001) 






1 iU 




U.OoD 


"iOllQ of 


AB018335 Hs.119387 


9725 


KIAA0792 gene product


1 1 T 


U.ol 


A <C/I 


/ii'icc 

4lJ35_at 


rNyD22y 


JtlS.l^Uool 


DDDDD 


B-cell CLL/lymphoma 11A (zinc finger protein)


lie 
1 lo 


U.ol 


A <Q/I 

U.Do4 


^ AAAO ••. 

4U0U2_r_at 


A TAO e A 

AI935442 


Hs.53542 


2:523U 


chorein 


110 

1 ly 


n fii 
u.ol 


A <QA 


4U4U4__S_ai 


TT1 OA1 

Ulo291 


TT— 1 CAO 

Hs.1592 


0001 

oool 


CDC16 (cell division cycle 16, S. cerevisiae, homolog)


120 


0.81 


A COO 

0.583 


40893_at 


AF058953 


Hs.182217 


OO AO 

8803 


succinate-CoA ligase, ADP-forming, beta subunit


121 


A O 

0-8 


0.583 


34840_at 


AI700633 


Hs.288232 




cDNA, 3 end 


lz2 


A O 
O.O 


A ^OO 


3ol23_at 


D87292 


Hs.248267 


/20J 


thiosulfate 
















sulfurtransferase 
















(rhodanese) 


IZJ 


A 0 
0.6 


A ^OO 


JJ24o_at 


H94842 


Hs. 17882 




EST 


1 

IZ'f 


U.O 


A ^CO 


o4ooo_ai 


AF055029 Hs.4988 




r«1j^<MA 0>IT1 1 

clone 24/11 




U.O 




J4Z3j_ai 


AF059202 


Hs.288627 


8694 


diacylglycerol O-acyltransferase (mouse) homolog


Izo 


A Q 

U.O 


A coo 


371oo_s_at 


U11863 


Hs.75741 


OiC 

2o 


amiloride binding 
















protein 1 (amine 
















oxidase (copper- 
















containing)) 




A Q 
U.O 


A ^QO 


41 22 j_at 


M22760 


Hs.181028 


y5l 1 


cytochrome c oxidase subunit Va


IZo 


A 90 

u. /y 


A <^01 

U.Dol 


34J3^_ai 


AI765533 


Hs.30942 


1 Q/IC 

iy4o 


ephrin-B2 


izy 


A 7Q 


A ^Ql 


'^ATIO of 
/ lZ__al 


AB023227 Hs.23860 


23200 


TiTT A A 1 A1 A t-k-rvxfA^-n 

JsJAA 1 u 1 u protem 


1 jU 


A TO 

u. /y 


A <Q1 
U.Dol 


1 KA of 


U02388 


Hs.101 


OCOA 

8529 


cytochrome P450, subfamily IVF, polypeptide 2


111 
lil 


A TO 

u. /y 


A COA 

U.JoU 


'2>100A «+ 

34o2y_at 


U59151 


Hs.4747 


1736 


dyskeratosis 
















congenita 1, 
















dyskerin 


liz 


A TA 


A COA 

U.580 


4U52 /_at 


AF000571 


Hs.156115 


J /o4 


potassium voltage-gated channel, KQT-like subfamily, member


133 


0.79 


0.580 


37757_at 


L23959 


Hs.79353 


ion 


1 

transcription factor 
















Dp-1 
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134 0.79 

135 0.79 

136 0.78 

137 0.78 

138 0.78 

139 0.78 



140 0.78 

141 0.78 

142 0.77 



143 0.77 



144 0.77 



145 0.77 



146 0.77 



147 0.77 

148 0.77 

149 0.77 



0.578 39362_r_at 
0.578 37690 at 



0.577 35029_at 
0.577 31849_at 
0.577 40333 at 



GB/nGR 


UNIGENE LL num 


Desc 


Identifier 


(as of 




(nnigeneAocuslink 




summer 




or affy) 




2001) 






D14520 


Hs.84728 


688 


Kruppel-like factor 








5 (intestinal) 


D84110 


Hs.80248 


11030 


RNA-binding 








protein gene with 








multiple splicing 


U27193 


Hs.41688 


1850 


dual specificity 








phosphatase 8 


AB011540 Hs.4930 


4038 


low density 








lipoprotein 








receptor-related 








protein 4 


AL050139 Hs.75277 


64795 


hypothetical protein 








FLJ13910


U55206 


Hs.78619 


8836 


gamma-glutamyl 








hydrolase 








(conjugase, 








folylpolygammaglut 








amyl hydrolase) 


U32315 


Hs.82240 


6809 


syntaxin 3A


Y07593 


Hs.79187 


1525 


coxsackie virus and 








adenovirus receptor 


AF059531 


Hs.152337 


10196 


protein arginine N- 








methyltransferase 








3(hnRNP 








methyltransferase S. 








cerevisiae)-like 3 


AC004523 Hs. 180570 


66002 


hypothetical protein 








similar to rat 








CYP4F1 


H06628 


Hs.199067 


2065 


v-erb-b2 avian 








erythroblastic 








leukemia viral 








oncogene homolog 
3 


AF043906


Hs.121068


7105 


transmembrane 4 








superfamily 








member 6 


U61263 


Hs.78880 


10994 


ilvB (bacterial 








acetolactate 








synthase)-like 


Y07828 


Hs.91096 


11074 


ring finger protein 


AB011136 Hs.151385 


23078 


KIAA0564 protein 


U43842 


Hs.68879 


652 


bone 








morphogenetic 








protein 4 
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150 0.77 



151 0.76 

152 0.76 



153 0.76 



154 0.76 



155 0.76 



156 0.76 

157 0.76 

158 0.75 



159 0.75 

160 0.75 

161 0.75 



s2n_obs Perm non_norm_list GB/TIGR UNIGENE LL.num 
0.1% Identifier (as of 

summer 
2001) 

0.577 1827 s at M13929 



162 0.75 

163 0.75 

164 0.75 

165 0.75 



0.577 33103_s_at U37122 Hs.324470 120
0.576 38247_at U67058 Hs.168102

0.576 31854_at AF035582 Hs.151469 8573

0.576 35932_at AF081507

0.576 39540_at AF000561 Hs.104640 51341

0.576 41713_at

0.576 35444_at
0.576 39219_at

0.575 37672_at

0.575 32502_at
0.574 37423_at

0.574 1445_at

U09848 Hs.132390 7586

AC004030 Hs.71779
U20240 Hs.2227 1054

Z72499 Hs.78683 7874

AL041124 Hs.6748 81544
U30246 Hs.110736 6558

0.574 37720_at M22382 Hs.79037 3329

AF014958 Hs.302043 9034

0.574 36821_at AL050367 Hs.66762

0.573 37188_at X92720 Hs.75812 5106

1 



Desc (unigene/locuslink or affy)

c-myc-P64 mRNA, initiating from promoter P0, (HLmyc2.5)
adducin 3 (gamma)
Coagulation factor II (thrombin) receptor-like 1
calcium/calmodulin-dependent serine protein kinase (MAGUK family)
left-right determination, factor B
HIV-1 inducer of short transcripts binding protein
zinc finger protein 36 (KOX18)
Cosmid F21856
CCAAT/enhancer binding protein (C/EBP), gamma
ubiquitin specific protease 7 (herpes virus-associated)
hypothetical protein PP1665
solute carrier family 12 (sodium/potassium/chloride transporters), member 2
heat shock 60kD protein 1 (chaperonin)
chemokine (C-C motif) receptor-like 2
clone DKFZp564A026
phosphoenolpyruvate carboxykinase 2 (mitochondrial)
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s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

166 0.75 0.573 37177_at Y00636 Hs.75626 965 CD58 antigen, (lymphocyte function-associated antigen 3)
167 0.75 0.573 31669_s_at AF039307 Hs.249171 3207 homeo box A11
168 0.75 0.573 35673_at U02082 Hs.334 7984 Rho guanine nucleotide exchange factor (GEF) 5
169 0.75 0.573 283_at L16842 Hs.119251 7384 ubiquinol-cytochrome c reductase core protein I
170 0.75 0.572 35727_at AI249721 Hs.39850 54963 hypothetical protein FLJ20517
171 0.74 0.572 40445_at AF017307 Hs.166096 1999 E74-like factor 3 (ets domain transcription factor, epithelial-specific)
172 0.74 0.572 1943_at X51688 Hs.85137 890 cyclin A2
173 0.74 0.572 39801_at AF046889 Hs.153357 8985 procollagen-lysine, 2-oxoglutarate 5-dioxygenase 3
174 0.74 0.572 288_s_at L25931 Hs.152931 3930 lamin B receptor
175 0.74 0.571 32320_at Z11502 Hs.181107 312 annexin A13
176 0.74 0.571 37501_at Y07707 Hs.119018 55922 transcription factor NRF
177 0.73 0.571 476_s_at U50079 Hs.88556 3065 histone deacetylase 1
178 0.73 0.571 864_at U07664 homeo box HB9
179 0.73 0.570 34046_at Z83844 Hs.97858 23616 hypothetical protein dJ37E16.5
180 0.73 0.570 1385_at M77349 Hs.118787 7045 transforming growth factor, beta-induced, 68kD
181 0.73 0.570 31887_at J04469 Hs.153998 1159 creatine kinase, mitochondrial 1 (ubiquitous)
182 0.73 0.570 36764_at AC004125 Hs.7235 10368 calcium channel, voltage-dependent, gamma subunit 3
183 0.73 0.570 35140_at R59697 Hs.25283 1024 cyclin-dependent kinase 8
184 0.73 0.570 367_at Z29067 Hs.2236 4752 NIMA (never in mitosis gene a)-related kinase 3






WO 03/029273



PCT/US02/30797 



s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

185 0.73 0.569 41276_at W27641 Hs.23964 10284 sin3-associated polypeptide, 18kD
186 0.73 0.569 37562_at L11370 Hs.79769 5097 protocadherin 1 (cadherin-like 1)
187 0.73 0.569 38630_at AL080192 Hs.101282 clone DKFZp434B102
188 0.73 0.569 40123_at D87435 Hs.155499 8729 golgi-specific brefeldin A resistance factor 1
189 0.73 0.569 32601_s_at AC004382 Hs.279832 55715 small inducible cytokine subfamily A (Cys-Cys), member 17
190 0.72 0.569 33573_at AB009426 apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1
191 0.72 0.569 35656_at AJ010346 Hs.32597 6049 ring finger protein (C3H2C3 type) 6
192 0.72 0.569 39876_at AL035252 Hs.12330 955 ectonucleoside triphosphate diphosphohydrolase 6 (putative function)
193 0.72 0.569 2064_g_at L20046 Hs.48576 2073 excision repair cross-complementing rodent repair deficiency, complementation group 5 (xeroderma pigmentosum, complementation group G (Cockayne syndrome))
194 0.72 0.569 40067_at M82882 Hs.154365 1997 E74-like factor 1 (ets domain transcription factor)
195 0.72 0.568 34339_at AB009282 Hs.79103 80777 cytochrome b5 outer mitochondrial membrane precursor
196 0.72 0.568 38518_at Y18004 Hs.171558 10389 sex comb on midleg (Drosophila)-like 2
197 0.71 0.567 37809_at U41813 Hs.127428 3205 homeo box A9
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s2n_obs Perm non^normjist GB/TIGR UNIGENE LL^num Desc 

0.1 o/o Identifier (as of (unigene/locuslink 

summer or affy) 

2001) 

198 0.71 0.567 36613_at U09585 Hs.315177 7866 interferon-related developmental regulator 2

199 0.71 0.567 31324_at U82303 Hs.123080 unknown protein mRNA

200 0.71 0.567 308Xat J03756 Hs.65149 2689 growth hormone 2 



Table 7: CO Markers 

[00137] According to the invention, preferred markers are markers 1-30, preferably 1-20, and more preferably 1-10.
Class: CO 

s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

1 0.81 0.681 493_at U29171 Hs.75852 1453 casein kinase 1, delta
2 0.8 0.620 39431_at AJ132583 Hs.293007 9520 Aminopeptidase puromycin sensitive
3 0.78 0.599 1953_at AF024710 Hs.73793 7422 vascular endothelial growth factor
4 0.75 0.584 34678_at AL096713 Hs.234680 26509 fer-1 (C.elegans)-like 3 (myoferlin)
5 0.73 0.570 32919_at AC004010 Hs.121520 BAC clone GS099H08
6 0.72 0.545 884_at M59911 Hs.265829 3675 integrin, alpha 3 (antigen CD49C, alpha 3 subunit of VLA-3 receptor)
7 0.71 0.531 38261_at AF085692 Hs.90786 8714 ATP-binding cassette, sub-family C (CFTR/MRP), member 3
8 0.7 0.528 33889_s_at D79985 Hs.2491 9993 DiGeorge syndrome critical region gene 2
9 0.7 0.524 31888_s_at AF001294 Hs.154036 7262 tumor suppressing subtransferable candidate 3
10 0.69 0.522 38127_at Z48199 Hs.82109 6382 syndecan 1
11 0.66 0.514 38132_at M88338 Hs.148101 11135 serum constituent protein
12 0.65 0.511 2017_s_at M64349 Hs.82932 893 cyclin D1 (PRAD1: parathyroid adenomatosis 1)
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X vl lU 


u uu uriii_iia i 


GB/TIGR UNIGENE LL num 








0.1% 




Identifier (as of 
summer 
9oni^ 




(unigene/locusUnk 
or affy) 




0 64 




D\JX.\fi. at (tl 


Mojy /O 




VctoCUlar CllUUlIlCllal 

growth factor 


14 


0.64 


0.509 


33354_^at 


AA63031 Hs.194477 
2 


64750 


E3 ubiquitin ligase 
SMURF2 


15 


0.64 


0.507 


32206_at 




Q876 


KIAA0451 gene 
pruciuoi 


16 


0.61 


0.499 


168 at 


U50196 Hs.94382 


132 


adenosine kinase 


17 


0.61 


0.492 


39962^at 


T TSO'^O^ TTc 447nR 
yj^yovj jio.*T*r/uo 


R476 

0*T /O 


Ser-Thr protein 

KIQaS6 rcialcQ LO 1116 

myotonic dystrophy 
protein kinase 


18 


0.6 


0.489 


33944_at 


oouuy y Jis.z / i^D i o 


334 


amyloid beta (A4) 
precursor-like 
protein 2 


19 


0.6 


0.488 


32094.at 






carbohydrate 
^cnonaroiim 
6/keratan) 
sulfotransferase 3 


20 0.6 0.486 40504_at AF001601 Hs.169857 5445 paraoxonase 2
21 0.59 0.485 36117_at L13616 Hs.740 5747 PTK2 protein tyrosine kinase 2




U.JO 


A ARO 
l/.*foU 




AB018356 Hs.225939 


8869 


sialyltransferase 9 
(CMP- 

NeuAc:lactosylcera 
imde aipna-z,3- 
sialyltransferase; 
GM3 synthase) 




0 f 7 


0 477 


'^^919 sit 


AF064801 Hs.28285 


11236 


paicneu reiaieu 
protein translocated 
in renal cancer 




U.D / 


U.4/0 


34 /yo_at 


jVOJO/y XlS.414/ 


Z34/1 


translocating chain- 
associating 
membrane protein 




U.30 


U.4/D 


4Uzzy_at 


AJ010071 Hs.153504 


10040 


target of mybl 
(chicken) homolog- 

UKC 1 


26 


0,55 


0.473 


34793_s_at 






plastiQ 3 (T isoform) 


97 






'^R64'^ at 




DD\JHl 


nypoineiicai proiem 


28 0.55 0.472 35350_at AB011170 Hs.6079 51363 B cell RAG associated protein
29 0.55 0.471 38028_at AL050152 Hs.301914 55885 clone DKFZp586K1220
30 0.55 0.471 1030_s_at U07806 Hs.317 7150 topoisomerase (DNA) I
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s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

31 0.54 0.469 37741_at M77836 Hs.79217 5831 pyrroline-5-carboxylate reductase 1
32 0.54 0.469 35294_at M25077 Hs.554 6738 Sjogren syndrome antigen A2 (60kD, ribonucleoprotein autoantigen SS-A/Ro)
33 0.53 0.468 38306_at AA477576 Hs.94631 10565 brefeldin A-inhibited guanine nucleotide-exchange protein 1
34 0.53 0.467 33128_s_at W68521 Hs.83393 1474 cystatin E/M
35 0.53 0.463 40471_at Y09048 Hs.168670 5824 peroxisomal farnesylated protein
36 0.52 0.462 31680_at M55630 topoisomerase I pseudogene 2
37 0.52 0.460 41140_at U05875 Hs.177559 3460 interferon gamma receptor 2 (interferon gamma transducer 1)
38 0.52 0.459 33931_at X71973 Hs.2706 2879 glutathione peroxidase 4 (phospholipid hydroperoxidase)
39 0.52 0.459 393_s_at X90976 Hs.129914 861 runt-related transcription factor 1 (acute myeloid leukemia 1; aml1 oncogene)
40 0.52 0.459 36036_at J05500 Hs.47431 6710 spectrin, beta, erythrocytic (includes spherocytosis, clinical type I)
41 0.51 0.459 39411_at AL080156 Hs.12813 25976 DKFZP434J214 protein
42 0.51 0.459 33454_at AF016903 Hs.273330 180 agrin
43 0.51 0.458 33121_g_at AF045229 Hs.82280 6001 regulator of G-protein signalling 10
44 0.5 0.458 40093_at X83425 Hs.155048 4059 Lutheran blood group (Auberger b antigen included)
45 0.5 0.456 977_s_at Z35402 Hs.194657 999 cadherin 1, type 1, E-cadherin (epithelial)
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s2n_obs 


Perm 


non_noriii_list GB/TIGR 


UNIGENE 


LL_num 


Desc 






0 1% 




Identifier 


yd!S 01 

summer 




^unigene/iocusuDK 
or affy) 


46 


0.5 


0.456 


33421 s at 




Hs.288031 


6309 


sterol-CS-desaturase 

rfllTiafll FTjrr'^ flialfo, 

5-desaturase)-like 


47 


0.5 


0.455 


39712_at 


AI541308 






^100 pQlrnim- 

binding protein A13 


48 


0.49 


0.452 


33894 at 


A T0 10046 


Hs.25155 


10276 


neuroepithelial cell 

ixaiisxoiiixiiig gene 1 


49 


0.49 


0.451 






Hs.80206 


2539 


glucose-6-phospliate 
dehydrogenase 


50 


0.49 


0.450 


32715_at 


N90862 


Hs. 172684 


8673 


vesicle-associated 
membrane protein 8 
(endobrevin) 


51 


0.49 


0.448 


/ J ox 


AT 046040 


Hs.250723 


79086 


hypothetical protein 


52 


0.49 


0.448 


AO'XO'X at 




Hs.61796 


7022 


transcription factor 

/\ir-Z ganmia 

(activating enhancer- 
binding protein 2 
ganiniaj 








39277_at 


U60805 


xiS.2ioo4o 


A1 OA 


oncostatin M 
receptor 


54 


0.48 


0.446 






Hs.7837 


10221 


phosphoprotein 
regulated by 
tniiogemc painways 


55 


0.48 


0.444 


"^RdO^ flt 




Hs.83086 




GT212mRNA 


56 


0 48 




291_s_at 


J04152 




4070 


unnor-associaieu 
calcium signal 
uonsuuccr ^ 


J / 


\J.HO 


Ci AAA 


34885_at 


AJ002308 


TTr. ^AQ'? 

jis.3uy / 


01 /I /! 

yi44 


synaptogyrin 2 


58 


0.48 


0.444 


j/uui_ai 


M23z!>4 


Hs.76288 


824 


calpain 2, (m/II) 
large subunit 


59 


0.48 


0.443 


AfiQOSl o+ 

*fUi^zo_ai 




Hs. 187991 


26118 


DKFZP564A122 
protein 


60 


0.48 


0.443 


ziin7R 5it 




Hs.98508 


23144 


KIAAOl 50 protein 


61 


0 47 


0 44"? 




A UA/1 1 OCO 


ns. i^^uf u 




zuic nnger proiem 
21/ 


62 


0.47 


0.442 


J /yiz_at 


VOAOAA 

AoUzUU 


Hs.8375 


9618 


TNF receptor- 
associated factor 4 


63 0.47 0.442 36933_at D87953 Hs.75789 10397 N-myc downstream regulated
64 0.47 0.442 35442_at AB007958 Hs.169431 57243 KIAA0489 protein
65 0.47 0.442 33754_at U43203 Hs.197764 7080 thyroid transcription factor 1
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s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

66 0.47 0.442 34823_at X60708 Hs.44926 1803 dipeptidylpeptidase IV (CD26, adenosine deaminase complexing protein 2)
67 0.47 0.441 35276_at AB000712 Hs.5372 1364 claudin 4
68 0.47 0.441 40088_at X84373 Hs.155017 8204 nuclear receptor interacting protein 1
69 0.46 0.440 1274_s_at L22005 Hs.76932 997 cell division cycle 34
70 0.46 0.440 39698_at U51712 Hs.13775 84525 hypothetical protein SMAP31
71 0.46 0.440 37103_at AF070610 Hs.100543 clone 24505
72 0.46 0.439 39382_at AB011089 Hs.12372 23321 KIAA0517 protein
73 0.46 0.439 37360_at U66711 Hs.77667 4061 lymphocyte antigen 6 complex, locus E
74 0.46 0.439 32640_at M24283 Hs.168383 3383 intercellular adhesion molecule 1 (CD54), human rhinovirus receptor
75 0.45 0.438 38762_at AF083255 Hs.8765 11325 RNA helicase-related protein
76 0.45 0.438 39021_at AB020684 Hs.11217 23333 KIAA0877 protein
77 0.45 0.437 35326_at AF004876 Hs.5809 10897 putative transmembrane protein; homolog of yeast Golgi membrane protein Yif1p (Yip1p-interacting factor)
78 0.45 0.437 33942_s_at AF004563 Hs.239356 6812 syntaxin binding protein 1
79 0.45 0.435 32830_g_at X97544 Hs.20716 10440 translocase of inner mitochondrial membrane 17 (yeast) homolog A
80 0.44 0.435 33448_at AB000095 Hs.233950 6692 serine protease inhibitor, Kunitz type 1
81 0.44 0.434 36201_at D13315 Hs.75207 2739 glyoxalase I
82 0.44 0.434 2035_s_at M55914 Hs.284127 4346 MYC promoter-binding protein 1
83 0.44 0.433 34759_at U68494 Hs.24385 hbc647 mRNA sequence
84 0.44 0.433 38819_at U33635 Hs.90572 5754 PTK7 protein tyrosine kinase 7
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Table 8: Other Markers
Class: Other 





i9*(U vILr 


X vl III 


finn n#ii"m lie 

Htlil tim lU^Ua 




UNIGENE LL num 








0 1% 


t 

l> 


luenuiier 


(as of 




^umgeD c/ lociisiuiK 












sumiiicr 




or aiiyj 












^uux ^ 








0 46 


0 4'^fi 


60S flt 


A/ri259Q 


Tie i/^o/ini 
JlS.loy4Ul 


o4o 


dpoiipoproicin 


9 


U.'f J 


u.^z / 


iooj_s__ai 








xinaotnelial Cell 










HI ^*t*t 






Lrrowin r acior 1 


3 


0.45 


0.373 


35820 at 


Xfi2078 






VJIYL^ gdll^llUolUC 
















acnvaior proxein 


A 








My /yjo 


Xlo*Z> 11*0111 


6779 


transcription factor 


















«/ 


0 44 




'^79 10 flt 


V79755 


Hs.77367 


4283 


nionoiunc inaucea 
















by gamma interferon 


6 


0.43 


0.362 


33956_at 


AB018549 




Z^04J 


MD-2 protein 


7 






jHooj^ai 




TTc 97844'^ 
XIS.Z / OH*t^ 




low-ainmty igCi rc 
















reccpior ^oeia-rc- 
















gamma-Rn) 


u 


0 42 










1 ROO 


wnuuuicuax ecu 
















^OWlll laClOr 1 
















(platelet-uenvea) 


Q 


0 41 








TT<i 75580 


5^ 


acid phosphatase 2, 


10 


0 41 


0 353 




nR6Q61 


TTc 70900 


1 A1 54 


lysosomal 
upouia ruvL^ii^ 
















fusion partner-like 2 


1 1 


0 4 






T TR 1 ROO 






d^lii'f'A /^0't*v*iAi* 'ro-n^ilf r 

soiuie earner lanuiy 
















16 (monocarboxyUc 
















acid transporters). 
















meniDcr d 


12 


04 


0 350 


36753 at 


AFn79000 


TTc 67R46 

XIS.U / 0*tO 


1 1006 


1 01l1r/>/> Y/l'A 

x6UK.ocyie 
















ininiunogio Duiin-iiKe 
















icC/cpior, suDianiiiy 
















B (with IM and 
















1 1 JM domamsj. 
















member 4 


ID 








ArUD2lz4 






secreted 
















piiospnoproiem i 
















(osteopontin, bone 
















sialoprotem I, early 
















T-lymphocyte 
















activation 1) 


14 0.38 0.347 37310_at X02419 Hs.77274 5328 plasminogen activator, urokinase
15 0.38 0.346 39008_at M13699 Hs.296634 1356 ceruloplasmin (ferroxidase)
16 0.37 0.344 35714_at U89606 Hs.38041 8566 pyridoxal (pyridoxine, vitamin B6) kinase
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s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

17 0.37 0.344 36661_s_at X06882 Hs.75627 929 CD14 antigen
18 0.36 0.342 38077_at X52022 Hs.80988 1293 collagen, type VI, alpha 3
19 0.36 0.340 32488_at X14420 Hs.119571 1281 collagen, type III, alpha 1 (Ehlers-Danlos syndrome type IV, autosomal dominant)
20 0.36 0.340 39945_at U09278 Hs.418 2191 fibroblast activation protein, alpha
21 0.36 0.339 128_at X82153 Hs.83942 1513 cathepsin K (pycnodysostosis)
22 0.36 0.336 31859_at J05070 Hs.151738 4318 matrix metalloproteinase 9 (gelatinase B, 92kD gelatinase, 92kD type IV collagenase)
23 0.36 0.335 32306_g_at J03464 Hs.179573 1278 collagen, type I, alpha 2
24 0.35 0.334 40297_at AC005053 Hs.61635 26872 six transmembrane epithelial antigen of the prostate
25 0.35 0.333 771_s_at D00749 CD7 antigen (p41)
26 0.35 0.331 40496_at J04080 Hs.169756 716 complement component 1, s subcomponent
27 0.35 0.329 1184_at D45248 Hs.179774 5721 proteasome (prosome, macropain) activator subunit 2 (PA28 beta)
28 0.34 0.329 1717_s_at U45878 Hs.127799 330 baculoviral IAP repeat-containing 3
29 0.34 0.329 1039_s_at U22431 Hs.197540 3091 hypoxia-inducible factor 1, alpha subunit (basic helix-loop-helix transcription factor)
30 0.34 0.328 32193_at AF030339 Hs.286229 10154 plexin C1
31 0.34 0.328 464_s_at U72882 Hs.50842 3430 interferon-induced protein 35
32 0.34 0.325 41471_at W72424 Hs.112405 6280 S100 calcium-binding protein A9 (calgranulin B)
33 0.33 0.325 368_at Z29083 Hs.82128 10860 5T4 oncofetal trophoblast
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s2n_obs  Perm 0.1%  non_norm_list  GB/TIGR Identifier  UNIGENE (as of summer 2001)  LL_num  Desc (unigene/locuslink or affy)

glycoprotein
34 0.33 0.323 195_s_at U28014 Hs.74122 837 caspase 4, apoptosis-related cysteine protease
35 0.33 0.323 34386_at AF072250 Hs.35947 8930 methyl-CpG binding domain protein 4
36 0.33 0.322 38631_at M92357 Hs.101382 7127 tumor necrosis factor, alpha-induced protein 2
37 0.33 0.321 37220_at M63835 Fc fragment of IgG, high affinity Ia, receptor for (CD64)
38 0.33 0.321 32700_at M55543 Hs.171862 2634 guanylate binding protein 2, interferon-inducible
39 0.32 0.320 32434_at D10522 Hs.75607 4082 myristoylated alanine-rich protein kinase C substrate (MARCKS, 80K-L)
40 0.32 0.320 34666_at X07834 Hs.318885 6648 superoxide dismutase 2, mitochondrial
41 0.32 0.320 1633_g_at U77735 Hs.80205 11040 pim-2 oncogene
42 0.32 0.319 39827_at AA522530 Hs.111244 54541 hypothetical protein
43 0.32 0.319 231_at M55153 Hs.8265 7052 transglutaminase 2 (C polypeptide, protein-glutamine-gamma-glutamyltransferase)
44 0.32 0.319 35474_s_at Y15915 Hs.172928 1277 collagen, type I, alpha 1
45 0.32 0.318 40712_at D26579 Hs.86947 101 a disintegrin and metalloproteinase domain 8
46 0.32 0.317 1042_at U27185 Hs.82547 5918 retinoic acid receptor responder (tazarotene induced) 1
47 0.32 0.317 37922_at L02648 Hs.84232 6948 transcobalamin II; macrocytic anemia
48 0.32 0.316 35816_at U46692 Hs.695 1476 cystatin B (stefin B)
49 0.32 0.315 38111_at X15998 Hs.81800 1462 chondroitin sulfate proteoglycan 2 (versican)
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Table 9 - Group 1 



Rank  s2n v.  s2n v.  Feature  Genbank_or_tigr  Description

1 0.89 0.57 493_at U29171 casein kinase 1, delta
2 0.80 0.53 39431_at AJ132583 puromycin sensitive aminopeptidase
3 0.78 0.52 1953_at AF024710 vascular endothelial growth factor (VEGF)
4 0.75 0.52 34678_at AL096713 fer-1 (C. elegans)-like 3 (myoferlin)
5 0.74 0.51 36100_at AF022375 vascular endothelial growth factor (VEGF)
6 0.73 0.51 32919_at AC004010 BAC clone GS099H08
7 0.72 0.50 884_at M59911 integrin, alpha 3 (CD49C antigen)
8 0.71 0.49 38261_at AF085692 ATP-binding cassette, sub-family C (CFTR/MRP)
9 0.70 0.49 31888_s_at AF001294 tumor suppressing subtransferable candidate 3
10 0.69 0.48 38127_at Z48199 syndecan 1
11 0.69 0.46 33889_s_at D79985 DiGeorge syndrome critical region gene 2
12 0.66 0.46 38132_at M88338 serum constituent protein
13 0.65 0.45 2017_s_at M64349 cyclin D1 (PRAD1: parathyroid adenomatosis 1)
14 0.64 0.45 36101_s_at M63978 vascular endothelial growth factor (VEGF)
15 0.64 0.45 33354_at AA630312 E3 ubiquitin ligase SMURF2
16 0.64 0.45 32206_at AB007920 KIAA0450 gene product
17 0.64 0.44 1930_at U83659 ATP-binding cassette, sub-family C (CFTR/MRP)
18 0.64 0.44 40237_at AF035444 tumor suppressing subtransferable candidate 3
19 0.61 0.44 168_at U50196 Adenosine kinase
20 0.61 0.44 39962_at U59305 ser-thr protein kinase PK428
21 0.60 0.44 33944_at S60099 Amyloid beta (A4) precursor-like protein 2
22 0.60 0.44 32094_at AB017915 chondroitin 6-sulfotransferase
23 0.60 0.44 40504_at AF001601 paraoxonase 2
24 0.59 0.44 36117_at L13616 PTK2, focal adhesion kinase
25 0.59 0.44 40229_at AJ010071 target of myb1-like



Class - CM

Rank  s2n v.  s2n v.  Feature  Genbank_or_tigr  Description

1 2.29 0.84 40392_at U51096 caudal type homeo box transcription factor 2
2 1.99 0.64 170_at U51096 caudal type homeo box transcription factor 2
3 1.60 0.64 40736_at X83228 cadherin 17, LI cadherin (liver-intestine)
4 1.55 0.63 37124_i_at J04813 cytochrome P450, subfamily IIIA (niphedipine oxidase)
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szn V. szn v reature 


Genbank or tigi 


ilescnptioii 






TT^l AO^ 


cauQai type nomeo dua LTdiiburipLioii 








lacior 1 


O 


1.4o U.OU 4UU4j_al 


Y71 'XA^ 


scnnc proiease, irypsinogcu 1 v 


1 


1 /in n <o 'x^fiAA rt+ 
1.4U U.jy 3!)o44__at 


A DAI A COO 

AiiU145yo 


Hephaestin 


Q 

(5 


v,Do u.jy ^zy/z_ai 


z«ojoiy 


JNAUrxi oxiaase 1 


Q 




TVifl AA4IA 
iVLlUUDU 


lauy aciG Dmaing proiein 1, iiver 




i.jj u.Do jyyDi ai 


T 7AC7/i 
IjZUoZO 


plastin 1 (I isofoim) 


1 1 
i 1 






uarcineonioryomc atingcii-reiaiea ceii 








aanesion moiccuie i 


12 1.30 0.57 1229_at U785566 Cisplatin resistance associated
13 1.30 0.57 37415_at AB018258 ATPase, Class V, type 10B


1 A 

14 


1 .Z / U.D / 41 /Uo__at 


AUUzoyj / 


KiAAlU34 protein 




1,ZZ U.DO /OD_S_ai 


A'DAAA7Q1 
AidUUO/oI 


galectin 4 


10 


1 .zz u . J 0 4uoy4_at 


Y7'3^A7 


cytokeratin 20 


1 n 
1 / 


i.zu U.JO 3yoy/_at 


Uzo /ZO 


hydroxysteroid (11 -beta) dehydrogenase 

7 


15 


1 7A A ^QQAvl of 

I.ZU U.DO jjyu4_ai 


A"DAAA71 A 
AdUUU / 14 


z 

ciauuin D 


1 O 

ly 


1 7A A 'J'3^<0 of 
I.ZU U.DO DD0Jy_2X 


T K 1 /1 1 7 

Uoi41z 


protein tyrosine kinase PTK6 


9n 
zu 


1 1 Q A AI 7^A of 
1.1 !7 U.jO 'tlZOO ai 


ADjDoO 


miegnn, aipna o 


21 1.19 0.55 35415_at X12901 villin 1
22 1.19 0.55 36170_at D83198 protein expressed in thyroid
23 1.18 0.55 37847_at AB006955 PDZ-73 protein
24 1.16 0.55 34595_at AF105424 myosin IA


ZD 


1 A ^717^ f* of 

1.10 u.jd j/izj_i_ai 


JU4olO 


cyiocnrome Jr4DU, suDiamiiy iii/\ 








^nipneoipine oxiaase^ 












szii V, szn V reaiurc 


vvcn D anK_or_iigi 


jL/escripnun 


1 
1 


1 7Q A '^A/t<I7 of 

i.zy u.oD jD4j /_ai 


TT1 ao/;a 
UlUoOU 


guanine monophosphate synthetase 


Z 


1 7^ A 70 ACVK 1 7 of 

1 .ZD u. /y 4U1 1 /_ai 


r%QA^^7 
Lio4jD / 


jvuiucnromosome mainienance aencieni 








(misD, 0. rombe) o 




1 .zz U. / J 5 155 l_2X 


A 1 OAO A An 

A10U3447 


small nuclear ribonucleoprotein 








poiypepnae ijr 


ii 
4 


1 71 A 71 /II ^A7 of 
l.Zl \jJd 4ij4/_ai 


AT7A^7/I77 
ArU4/4/Z 


i5\jd5 nomolog 


f 


1 1 7 A i^Qk 1 A^'^ rr 0+ 

1.1 / u.oy iUDD_g__at 


Mo /iiy 


replication factor C 


0 


1 17A/^0'1QQAA e 0* 

1.1 / u.oy joo4u_s^at 


T 1 A/^7Q 

LlUo/o 


protlun z 


7 


1 1 A A ^52 '^'^970 of 


AT AQ#?71 0 
/vLfUyO / ly 


proniin z 


Q 


1 17 A /^C 7QA/C^ 

l.lz U.OO JoUoD_at 


V/CO CI /I 

Aoz!>i4 


high-mobiUty group protein 2 




111 A /CO '7AQ rtf 

1.11 U.OO /uy_at 


J UUi 14 


tubulin, beta polypeptide 






AP00zt770 


iiap suiicture-speciiic enaonucicasc i 


1 1 


1 A7 A fn 'XAn^'x c of 
1 .U / U.O / j*r / 0 j_s___ai 




DxjDD nomoiog 


IZ 


1 AA A ^7 1 S7A e of 

1 .uo u.o / 1 oZ4_s_ai 


TAC^I A 


proiiieraimg cell nuciear aniigen ^jtv^jnaj 




1 OS 0 6S 401 fl- 


y\.x*to JV 


n.^r\. iiioiuiic idiiiiiy, xiiciiiud yv 


14 1.05 0.65 39109_at AB024704 chromosome 20 open reading frame 1
15 1.05 0.65 207_at M86752 stress-induced-phosphoprotein 1 (Hsp70/Hsp90 organizing protein)
16 1.04 0.65 1884_s_at M15796 proliferating cell nuclear antigen (PCNA)
17 1.03 0.64 34763_at AF020043 chondroitin sulfate proteoglycan 6 (bamacan)




18 1.03 0.64 572_at M86699 TTK protein kinase
19 1.02 0.64 40619_at M91670 ubiquitin carrier protein


on 


i.UU U.oJ IM s at 




r is. J uo-Dinaing protein i a ^ i zkjj ) 


21 1.00 0.63 1803_at X05360 cell division cycle 2, G1 to S and G2 to M
22 0.99 0.63 1515_at HG4074-HT4344 Rad2
23 0.98 0.63 34791_at X52882 t-complex 1
24 0.97 0.63 40690_at X54942 CDC28 protein kinase 2
25 0.96 0.63 37686_s_at Y09008 uracil-DNA glycosylase



Class - C2

Rank  s2n v.  s2n v.  Feature  Genebank_or_tigr  Description

1 1.46 0.77 40035_at AB012917 kallikrein 11
2 1.28 0.65 40544_s_at L08424 achaete-scute complex homolog-like 1
3 1.27 0.59 36606_at X51405 carboxypeptidase E
4 1.21 0.59 31477_at L08044 trefoil factor 3 (intestinal)
5 1.19 0.58 36299_at X02330 calcitonin/calcitonin-related polypeptide
6 1.17 0.57 40649_at X64810 proprotein convertase subtilisin/kexin type 1
7 1.16 0.57 40543_at L08424 achaete-scute complex homolog-like 1
8 1.16 0.57 442_at X15187 tumor rejection antigen (gp96) 1
9 1.11 0.56 37897_s_at AI985964 trefoil factor 3 (intestinal)
10 1.06 0.56 36300_at X15943 calcitonin/calcitonin-related polypeptide
11 1.02 0.56 39332_at AF035316 tubulin, beta polypeptide
12 0.97 0.55 39756_g_at Z93930 X-box binding protein 1
13 0.96 0.54 39135_at AB018310 KIAA0767 protein
14 0.95 0.54 34785_at AB028948 KIAA1025 protein
15 0.92 0.53 37617_at U90912 KIAA1128 protein
16 0.87 0.53 39755_at Z93930 X-box binding protein 1
17 0.85 0.53 37928_at AA621555 nuclear transcription factor Y, beta
18 0.85 0.53 1788_s_at U48807 dual specificity phosphatase 4
19 0.84 0.53 35995_at AF067656 ZW10 interactor
20 0.84 0.53 37141_at U39840 hepatocyte nuclear factor 3, alpha
21 0.83 0.53 40201_at M76180 dopa decarboxylase
22 0.82 0.52 1823_g_at HG4677-HT5102 Oncogene Ret/Ptc2
23 0.82 0.52 35800_at D63391 platelet-activating factor acetylhydrolase
24 0.81 0.52 1822_at HG4677-HT5102 Oncogene Ret/Ptc2
25 0.81 0.52 37426_at U80736 trinucleotide repeat containing 9


Class C3

Rank  s2n v.  s2n v.  Feature  Genebank_or_tigr  Description

1 1.42 0.67 37669_s_at U16799 Na+/K+ transporting ATPase
2 1.20 0.61 36066_at AB020635 KIAA0828 protein
3 1.17 0.60 33699_at M18667 pepsinogen C gene
4 1.06 0.58 1081_at M33764 Ornithine decarboxylase 1
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Rank 


52n V. 5211 v Feature 


Genebank_or__tigi 


i/escripuoii 


5     1.06    0.57    33396_at      U12472           Glutathione S-transferase pi
6     1.06    0.57    34319_at      AA131149         S100 calcium-binding protein P
7     1.04    0.56    829_s_at      U21689           Glutathione S-transferase pi
8     1.02    0.55    37004_at      J02761           [illegible]
9     1.02    0.55    [illegible]   [illegible]      Aldehyde dehydrogenase [illegible] family
10    [illegible]     [illegible]   [illegible]      aldo-keto reductase family 1
11    1.00    0.52    36203_at      X16277           Ornithine decarboxylase 1
12    0.99    0.52    [illegible]   [illegible]      Retinoic acid receptor
13    0.99    0.51    33052_at      [illegible]      Phospholipase A2
14    0.98    0.51    35207_at      X76180           Sodium channel, nonvoltage-gated 1 alpha
15    [illegible]  0.51  [illegible]  [illegible]    [illegible]-specific phosphodiesterase
16    0.97    0.51    [illegible]   [illegible]      NAD(P)H:quinone oxidoreductase
17    0.93    0.51    [illegible]   HG4058-HT4328    Fusion activated oncogene Aml1-Evi-1
18    0.93    0.51    [illegible]   Y08134           acid sphingomyelinase-like phosphodiesterase
19    0.92    0.50    38773_at      AB003151         carbonyl reductase 1
20    0.90    0.50    700_s_at      HG371-HT26388    Mucin 1, epithelial
21    [illegible]  0.50  [illegible]  [illegible]    phospholipase A2, group IVA


22    0.88    0.50    38986_at      Z49835           glucose regulated protein, 58kD
23    0.88    0.50    40685_at      U10868           aldehyde dehydrogenase 3 family, member B1
24    0.87    0.49    41267_at      AB028972         KIAA1049 protein
25    0.86    0.49    34839_at      AB029027         KIAA1104 protein



Class NL

Rank  s2n v.  s2n v.  Feature       Genbank_or_tigr  Description
1     1.97    0.61    32542_at      AF063002         four and a half LIM domains 1
2     1.92    0.59    1815_g_at     D50683           TGF-beta II receptor
3     1.82    0.58    36119_at      AF070648         clone 24651 mRNA
4     1.75    0.57    35868_at      M91211           advanced glycosylation end product-specific receptor
5     1.71    0.56    39031_at      AA152406         Cytochrome c oxidase
6     1.70    0.56    37398_at      AA100961         CD31 antigen
7     1.70    0.56    40607_at      U97105           Dihydropyrimidinase-like 2
8     1.70    0.56    40841_at      AF049910         Transforming, acidic coiled-coil containing protein 1
9     1.69    0.55    40331_at      AF035819         Macrophage receptor with collagenous structure
10    1.68    0.55    38454_g_at    X15606           Intercellular adhesion molecule 2
11    1.65    0.55    36569_at      X64559           tetranectin (plasminogen-binding protein)
12    1.63    0.55    39066_at      L38486           Microfibrillar-associated protein 4
13    1.60    0.54    40282_s_at    M84526           adipsin/complement factor D
14    1.60    0.54    34320_at      AL050224         polymerase I and transcript release factor
15    1.60    0.54    37027_at      M80899           AHNAK nucleoprotein (desmoyokin)
16    1.58    0.54    33328_at      W28612           EST
17    1.58    0.54    1814_at       D50683           TGF-beta II receptor
18    1.58    0.54    35985_at      AB023137         A kinase (PRKA) anchor protein 2
19    1.57    0.53    38177_at      AJ001015         RAMP2
20    1.57    0.53    39775_at      X54488           C1-inhibitor
21    1.57    0.53    770_at        D00632           glutathione peroxidase 3
22    1.54    0.53    39760_at      AL031781         KH domain RNA binding protein
23    1.54    0.53    268_at        L34657           platelet/endothelial cell adhesion molecule-1 (PECAM-1)
24    1.53    0.52    33756_at      U39447           amine oxidase (vascular adhesion protein 1)
25    1.52    0.52    40419_at      X85116           erythrocyte membrane protein band 7.2 (stomatin)


Class C5

Rank  s2n v.  s2n v.  Feature       Genbank_or_tigr  Description
1     1.06    0.73    1411_at       D16154           P-450c11
2     1.04    0.70    37021_at      X16832           Cathepsin H
3     1.02    0.70    534_s_at      U20391           folate receptor 1 (adult)
4     0.95    0.69    38394_at      D42047           KIAA0089 protein
5     0.94    0.67    1460_g_at     M68941           Protein tyrosine phosphatase
6     0.92    0.67    33331_at      U17077           BENE protein
7     0.91    0.65    38336_at      AB023230         KIAA1013 protein
8     0.89    0.65    31883_at      AF025794         Methionine synthase reductase (MTRR)
9     0.88    0.65    35016_at      M13560           Ia-associated invariant gamma-chain
10    0.88    0.65    37512_at      U89281           Oxidative 3 alpha hydroxysteroid dehydrogenase
11    0.87    0.64    1629_s_at     HG3187-HT3366    Tyrosine Phosphatase 1, Non-Receptor
12    0.86    0.64    38459_g_at    [illegible]      Cytochrome b5 (CYB5) gene
13    0.86    0.64    34139_at      AL049651         Somatostatin receptor 4
14    0.86    0.63    36965_at      U13616           Ankyrin G (ANK-3)
15    0.85    0.63    130_s_at      X82850           Thyroid transcription factor 1
16    0.85    0.63    593_s_at      M34353           v-ros avian UR2 sarcoma virus oncogene homolog 1
17    0.85    0.63    33278_at      AC004381         SA (rat hypertension-associated) homolog
18    0.85    0.63    821_s_at      U78793           folate receptor alpha (hFR)
19    0.82    0.63    40617_at      AC004381         Hypothetical protein FLJ20274
20    0.82    0.63    35792_at      U67963           Lysophospholipase-like
21    0.80    0.63    38785_at      X52228           mucin 1, transmembrane
22    0.80    0.63    33967_at      M31525           major histocompatibility complex, class II
23    0.80    [illegible]  [illegible]  U12128       APO-1/CD95 (Fas)-associated phosphatase
24    0.80    0.62    33584_at      U35146           CDC2-related kinase
25    0.80    0.62    33249_at      M16801           Nuclear receptor subfamily 3, group C, member 2
[00138] The invention may be embodied in other specific forms without departing
from the spirit or essential characteristics thereof. The foregoing embodiments are therefore
to be considered in all respects illustrative rather than limiting on the invention described
herein. Scope of the invention is thus indicated by the appended claims rather than by the
foregoing description, and all changes which come within the meaning and range of
equivalency of the claims are intended to be embraced therein.
[00139] Each of the patent documents and scientific publications disclosed
hereinabove is incorporated by reference herein in its entirety.
CLAIMS

1. A method for classifying lung carcinomas on the basis of gene expression, the method
comprising the steps of:
    a) assaying an expression level for each of a plurality of genes in a plurality of
    lung carcinoma samples; and,
    b) performing a clustering analysis on the expression levels of step a),
thereby identifying classes of lung carcinomas on the basis of gene expression.

2. The method of claim 1, wherein said clustering analysis is selected from the group
consisting of hierarchical clustering and probabilistic clustering.

3. A method for diagnosing a type of lung carcinoma, the method comprising the steps of:
    a) assaying an expression level for each of a predetermined number of markers of lung
    carcinoma in a lung carcinoma sample; and,
    b) identifying said lung carcinoma as a predetermined type of lung carcinoma if at least
    one of said expression levels is greater than a reference expression level.

4. The method of claim 3, wherein said predetermined number is between 2 and 50.

5. The method of claim 3, wherein said predetermined number is greater than 50.

6. The method of claim 4 or 5, wherein said markers of lung carcinoma are markers of at
least two different types of lung carcinoma.

7. The method of claim 3, wherein said type of lung carcinoma is selected from the group
consisting of metastatic cancers of non-lung origin, small cell lung carcinomas and non-small
cell lung carcinomas.

8. The method of claim 7, wherein said non-small cell lung carcinoma is selected from the
group consisting of adenocarcinomas, squamous cell carcinomas, and large cell carcinomas.

9. The method of claim 8, wherein said adenocarcinomas are selected from the group
consisting of classes C1, C2, C3, and C4.

10. The method of claim 3, wherein said markers are selected from the group consisting of
the genes shown in Tables 1-4.

11. The method of claim 10, wherein said markers are selected from the group consisting of
kallikrein 11, achaete-scute complex (Drosophila) homolog-like 1, carboxypeptidase E, trefoil
factor 3 (intestinal), calcitonin/calcitonin-related polypeptide alpha, proprotein convertase, dual
specificity phosphatase 4, and dopa decarboxylase.

12. The method of claim 3, further comprising the step of providing a prognosis for a patient
based on the identification of the type of lung carcinoma.

13. The method of claim 3, further comprising the step of recommending a treatment for a
patient based on the identification of the type of lung carcinoma.

14. The method of claim 13, wherein said treatment is tailored to the type of lung carcinoma.

15. A method for detecting lung carcinoma in a patient, the method comprising the steps of:
    a) assaying an expression level for a predetermined number of markers for lung
    carcinoma in a patient sample; and,
    b) detecting the presence of a lung carcinoma if at least one of said expression levels
    is greater than a predetermined reference level.

16. The method of claim 15, wherein said predetermined number is between 2 and 50.

17. The method of claim 15, wherein said predetermined number is greater than 50.

18. The method of claim 15 or 16, wherein said markers of lung carcinoma are markers of at
least two different types of lung carcinoma.

19. The method of claim 15, wherein said type of lung carcinoma is selected from the group
consisting of metastatic cancers of non-lung origin, small cell lung carcinomas and non-small
cell lung carcinomas.

20. The method of claim 19, wherein said non-small cell lung carcinoma is selected from the
group consisting of adenocarcinomas, squamous cell carcinomas, and large cell carcinomas.

21. The method of claim 20, wherein said adenocarcinomas are selected from the group
consisting of classes C1, C2, C3, and C4.

22. The method of claim 15, wherein said gene is selected from the group consisting of the
genes shown in Tables 1-4.

23. The method of claim 22, wherein said markers are selected from the group consisting of
kallikrein 11, achaete-scute complex (Drosophila) homolog-like 1, carboxypeptidase E, trefoil
factor 3 (intestinal), calcitonin/calcitonin-related polypeptide alpha, proprotein convertase, dual
specificity phosphatase 4, and dopa decarboxylase.

24. The method of claim 15, further comprising the step of providing a prognosis for a
patient based on the identification of the type of lung carcinoma.

25. The method of claim 15, further comprising the step of recommending a treatment for a
patient based on the identification of the type of lung carcinoma.

26. The method of claim 25, wherein said treatment is tailored to the type of lung carcinoma.

27. A diagnostic array comprising:
    a) a solid support; and
    b) a plurality of diagnostic agents coupled to said solid support, wherein each of said
    agents is used to assay the expression level of a specific marker of lung carcinoma.

28. The array of claim 27, wherein each of said diagnostic agents is selected from the group
consisting of PNA, DNA, and RNA molecules that specifically hybridize to a transcript from a
marker of lung carcinoma.

29. The array of claim 27, wherein each of said diagnostic agents is an antibody that
specifically binds to a protein expression product of a marker of lung carcinoma.

30. The array of claim 28 or 29, wherein said marker of lung carcinoma is a gene selected
from the group consisting of the genes shown in Tables 1-4.

31. The array of claim 30, wherein said lung carcinoma is an adenocarcinoma, and said
marker is selected from the group consisting of kallikrein 11, achaete-scute complex
(Drosophila) homolog-like 1, carboxypeptidase E, trefoil factor 3 (intestinal),
calcitonin/calcitonin-related polypeptide alpha, proprotein convertase, dual specificity
phosphatase 4, and dopa decarboxylase.

32. A diagnostic array consisting of:
    a) a solid support; and
    b) a plurality of diagnostic agents coupled to said solid support, wherein each of said
    agents is used to assay the expression level of a specific marker of lung carcinoma.

33. The array of claim 27 or 32, wherein said plurality comprises diagnostic agents
characteristic of at least two types of lung carcinoma.

34. A system for maintaining lung cancer marker expression levels, the system comprising a
memory device comprising a reference expression level for at least one marker of lung
carcinoma.

35. The system of claim 34 further comprising a reference expression level for at least one
marker of normal lung.

36. The system of claim 34, wherein each marker is selected from the group consisting of the
genes shown in Tables 1-4.

37. The system of claim 35, wherein each marker is selected from the group consisting of
kallikrein 11, achaete-scute complex (Drosophila) homolog-like 1, carboxypeptidase E, trefoil
factor 3 (intestinal), calcitonin/calcitonin-related polypeptide alpha, proprotein convertase, dual
specificity phosphatase 4, and dopa decarboxylase.

38. The system of claim 35, wherein said memory device is selected from the group
consisting of tapes, discs, RAM, ROM, and CDROM.

39. A computer disk comprising reference expression levels for a plurality of markers of lung
carcinoma.

40. A computer disk comprising a plurality of markers of lung carcinoma.

41. A method for evaluating a drug candidate, the method comprising the steps of:
    a) assaying an expression level for each of a predetermined number of lung cancer
    marker genes in a cell sample;
    b) exposing the cell sample to a drug candidate;
    c) assaying an expression level for each of the marker genes in the presence of the
    drug candidate; and
    d) identifying a positive drug candidate as one that decreases expression of at least
    one of said marker genes.

42. A method for monitoring drug treatment of a patient with lung cancer, the method
comprising the steps of:
    a) administering a drug to a patient with lung cancer; and
    b) assaying the expression level of a predetermined number of marker genes, wherein
    the expression level of the marker genes is an indicator of the disease status of the patient.

43. A method for classifying a lung carcinoma, the method comprising the steps of:
    a) assaying a gene expression profile of a lung carcinoma sample;
    b) comparing the gene expression profile of step a) with a reference expression
    profile characteristic of a known lung carcinoma type; and
    c) assigning the lung carcinoma sample to a known lung carcinoma type based on
    the comparison of step b).
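
Illustrative note (not part of the claims): steps b) and c) of claim 43 can be sketched in a few lines of Python. The claim does not name a comparison measure, so the use of Pearson correlation here is an assumption, as are the function names, the class labels "type_A" and "type_B", and every expression value, all of which are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def classify_profile(sample, references):
    """Compare a sample with each reference profile (step b) and
    assign it to the best-correlated type (step c)."""
    scores = {name: pearson(sample, ref) for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical reference profiles over the same four marker genes.
references = {
    "type_A": [9.0, 8.5, 2.0, 1.5],
    "type_B": [1.0, 2.0, 8.0, 9.5],
}
sample = [8.0, 9.0, 3.0, 1.0]  # hypothetical assayed profile (step a)

label, scores = classify_profile(sample, references)
print(label, scores)
```

Correlation ignores overall scale, which may or may not be desirable; a distance-based comparison would be an equally plausible reading of "comparing" in step b).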
As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite
payment of any additional fee.

As only some of the required additional search fees were timely paid by the applicant, this international search report
covers only those claims for which fees were paid, specifically claims Nos.:



No required additional search fees were timely paid by the applicant. Consequently, this international search report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 1-26 and 43 drawn to the single
Class C1, 1 marker for GMP. U10860



Remark on Protest | | The additional search fees were accompanied by the applicant's protest. 




No protest accompanied the payment of additional search fees. 
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION IS LACKING

Groups 1-633, Claims 1-26 and 43, drawn to methods of classifying lung tumors, detecting and subsequently diagnosing lung carcinoma
in a patient, and recommending treatment, all by assaying the expression level of the same predetermined marker chosen from Tables 1-4
(C1-C4 markers). For example, if applicant elects Group 1, then the methods of Claims 1-26 and 43 will be searched as they apply to
the expression of a single marker outlined in Tables 1-4, guanine monophosphate synthetase (U10860). Similarly, if applicant elects
Group 201, Claims 1-26 and 43 will be searched as they apply to the marker for kallikrein 11 (AB012917). If applicant elects Group 202,
Claims 1-26 and 43 will be searched as they apply to the marker achaete-scute complex (Drosophila) homolog-like 1 (L08424), and so on
through C3 and C4 Classes.

Upon election, please specify the marker to be searched, in addition to its respective group.

Groups 634-1266, Claims 27, 28, 30-33, drawn to a diagnostic array with a nucleic acid based diagnostic agent that is used to assay the
expression level of a specific marker of lung carcinoma. For example, if Group 634 is elected, Claims 27, 28, 30, 31, 32 and 33 will
be searched to the extent that the nucleic acid diagnostic agent will bind to the guanine monophosphate synthetase (U10860) marker (the
first marker listed in the C1 Class). Similarly, if Group 834 is elected, Claims 27, 28, 30-33 will be searched to the extent that the
nucleic acid diagnostic agent will bind to the kallikrein 11 (AB012917) marker (the first marker listed in the C2 Class).
Upon election, please specify the marker to be searched, in addition to its respective group.

Groups 1267-1899, Claims 27, 29, and 30-33, drawn to a diagnostic array with an antibody that specifically binds to a protein expression
product of a marker of lung carcinoma. For example, if Group 1267 is elected, Claims 27, 29, 30, 31, 32, and 33 will be searched to
the extent that the antibody diagnostic agent will bind to the protein expression product of the guanine monophosphate
synthetase (U10860) marker (the first marker listed in the C1 Class). Similarly, if Group 1467 is elected, Claims 27, 29, 30-33 will be
searched to the extent that the antibody diagnostic agent will bind to the kallikrein 11 (AB012917) marker (the first marker listed in the
C2 Class).

Upon election, please specify the marker to be searched, in addition to its respective group.

Groups 1900-2532, Claims 34-40, drawn to a system and computer disk for maintaining lung cancer marker expression levels, further
comprising a reference expression level of a single marker in a normal lung and a single marker selected from Tables 1-4. For example,
if applicant elects Group 1900, then Claims 34-40 will be searched to the extent that the marker in the system and disk is that of the
guanine monophosphate synthetase (U10860) marker.

Upon election, please specify the marker to be searched, in addition to its respective group.

Groups 2533-3164, Claims 41 and 42, drawn to a method for evaluating a drug candidate and for monitoring drug treatment for lung
cancer by assaying the expression level of a single marker gene from Tables 1-4. Again, for example, if Group 2533 is elected, the
method of Claims 41 and 42 will be searched as they apply to the guanine monophosphate synthetase (U10860) marker.

Applicant should note that each set of groups finds its members in each of the markers in the specification listed as Tables 1-4 (Classes
C1-C4), which total to 633 distinct markers.

The inventions listed as Groups 1-3164 do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT Rule
13.2, they lack the same or corresponding special technical features for the following reasons:

The method of Group 1, in claim 1, includes classifying lung carcinoma on the basis of gene expression by assaying an expression level
for each of a plurality of genes in a plurality of lung carcinoma samples in addition to performing a clustering analysis on the expression
levels to identify classes of lung carcinoma on the basis of gene expression. Kannan et al. (Oncogene 4/2001) teach the analysis of a
human lung cancer cell line and its profile of gene expression regulated by p53 at 32 degrees Celsius using DNA microarrays containing
approximately 7000 probes for human genes (abstract). Kannan et al. further taught cluster analysis of these data to identify classes of
p53-regulated and primary targets in the cell line. As the method of claims 1-26 and 43 does not represent a contribution over the prior art,
the claims lack a special technical feature of the other claimed inventions. Thus, there is no special technical feature linking the recited
compositions and methods of using said compositions, as would be necessary to fulfill the requirement for unity of invention.

Furthermore, it is also noted that each of the present claims has been presented in improper Markush format, as distinct methods,
diagnostic arrays and distinct systems are improperly joined in the claims. Each method, array, and system grouping comprises 633
distinct markers. The markers each consist of a unique nucleotide sequence and differ in their structural and functional properties.
Additionally, each combination of markers and method, array and system is distinct from the other in that each combination comprises
markers of distinct structure and as a whole each combination is functionally distinct over each other. Each method involving, array
containing, or system containing combination of markers has a different special technical feature. As the claimed compositions and
methods using said markers do not share a special technical feature, the distinct compositions and methods may not properly be presented
in the alternative. Accordingly, the claims have been separated into a number of groups corresponding to the number of different
inventions encompassed by the claims, and the claims will be searched only as they read upon the elected invention from the methods of
Groups 1900-2532, which require, for the system and computer disk used for maintaining lung cancer marker expression levels, different
pairs of markers, a single marker from a normal lung and a single marker selected from Tables 1-4.

Further, the claimed methods of Groups 1-633 and 2533-3164 have different objectives, require different process steps and require the
use of different reagents. The methods of Groups 1-633 require the steps of detecting and subsequently diagnosing lung carcinoma in a
patient and recommending treatment, all by assaying the expression level of the same predetermined marker chosen from Tables 1-4 (C1-C4
markers); the methods of Groups 2533-3164 require the steps of evaluating a drug candidate and for monitoring drug treatment for
lung cancer by assaying the expression level of a single marker gene. Each of the methods of Groups 1-633 and 2533-3164 requires the
use of different reagents to accommodate the different tasks and different nucleic acids, i.e. a distinct marker for each group. In addition
to differences in objectives, effects, and method steps, it is again noted that the claims of the present Groups are not directed to the
detection or identification of molecules having the same or common special technical feature, for the reasons discussed above.